This is an automated email from the ASF dual-hosted git repository.
jiayu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/sedona.git
The following commit(s) were added to refs/heads/master by this push:
new 59fc2e0c07 [GH-2407] Auto-detect raster columns in SedonaUtils.display_image (#2633)
59fc2e0c07 is described below
commit 59fc2e0c07d04cc25dd13604249bb1591b055fa3
Author: Jia Yu <[email protected]>
AuthorDate: Mon Feb 9 03:55:04 2026 -0700
[GH-2407] Auto-detect raster columns in SedonaUtils.display_image (#2633)
---
docs/api/sql/Raster-loader.md | 3 +++
docs/api/sql/Raster-map-algebra.md | 3 +++
docs/api/sql/Raster-operators.md | 3 +++
docs/api/sql/Raster-visualizer.md | 19 +++++++++----
docs/tutorial/raster.md | 20 ++++++++++++++
python/sedona/spark/raster_utils/SedonaUtils.py | 33 +++++++++++++++++++++++
python/tests/raster_viz_utils/test_sedonautils.py | 25 ++++++++++++++---
7 files changed, 97 insertions(+), 9 deletions(-)
diff --git a/docs/api/sql/Raster-loader.md b/docs/api/sql/Raster-loader.md
index f41d6c6ae9..d0df8c2231 100644
--- a/docs/api/sql/Raster-loader.md
+++ b/docs/api/sql/Raster-loader.md
@@ -22,6 +22,9 @@
The raster loader of Sedona leverages Spark built-in binary data source and works with several RS constructors to produce Raster type. Each raster is a row in the resulting DataFrame and stored in a `Raster` format.
+!!!tip
+ After loading rasters, you can quickly visualize them in a Jupyter notebook using `SedonaUtils.display_image(df)`. It automatically detects raster columns and renders them as images. See [Raster visualizer docs](Raster-visualizer.md#display-raster-in-jupyter) for details.
+
By default, these functions uses lon/lat order since `v1.5.0`. Before, it used
lat/lon order.
## Step 1: Load raster to a binary DataFrame
diff --git a/docs/api/sql/Raster-map-algebra.md b/docs/api/sql/Raster-map-algebra.md
index 70b7e52eef..a8431f9dd6 100644
--- a/docs/api/sql/Raster-map-algebra.md
+++ b/docs/api/sql/Raster-map-algebra.md
@@ -21,6 +21,9 @@
Map algebra is a way to perform raster calculations using mathematical expressions. The expression can be a simple arithmetic operation or a complex combination of multiple operations. The expression can be applied to a single raster band or multiple raster bands. The result of the expression is a new raster.
+!!!tip
+ To visually inspect the result of a map algebra operation in a Jupyter notebook, use `SedonaUtils.display_image(df)`. It automatically detects raster columns and renders them as images. See [Raster visualizer docs](Raster-visualizer.md#display-raster-in-jupyter) for details.
+
Apache Sedona provides two ways to perform map algebra operations:
1. Using the `RS_MapAlgebra` function.
diff --git a/docs/api/sql/Raster-operators.md b/docs/api/sql/Raster-operators.md
index 2408ded8cf..7124286d3c 100644
--- a/docs/api/sql/Raster-operators.md
+++ b/docs/api/sql/Raster-operators.md
@@ -17,6 +17,9 @@
under the License.
-->
+!!!tip
+ To quickly visualize raster data in a Jupyter notebook, use `SedonaUtils.display_image(df)`. It automatically detects raster columns and renders them as images. See [Raster visualizer docs](Raster-visualizer.md#display-raster-in-jupyter) for details.
+
## Pixel Functions
### RS_PixelAsCentroid
diff --git a/docs/api/sql/Raster-visualizer.md b/docs/api/sql/Raster-visualizer.md
index ce58eb986c..f8bcab5eac 100644
--- a/docs/api/sql/Raster-visualizer.md
+++ b/docs/api/sql/Raster-visualizer.md
@@ -71,21 +71,30 @@ Output:
"<img
src=\"\"
width=\"200\" />";
```
-!!!Tip
- RS_AsImage can be paired with SedonaUtils.display_image(df) wrapper inside a Jupyter notebook to directly print the raster as an image in the output, where the 'df' parameter is the dataframe containing the HTML data provided by RS_AsImage
+### Display raster in Jupyter
-Example:
+Introduction: `SedonaUtils.display_image(df)` is a Python wrapper that renders raster images directly in a Jupyter notebook. It automatically detects raster columns in the DataFrame and applies `RS_AsImage` under the hood, so you don't need to call `RS_AsImage` yourself. You can also pass a DataFrame with pre-applied `RS_AsImage` HTML.
+
+Since: `v1.7.0` (auto-detection of raster columns since `v1.9.0`)
+
+Example — direct raster display (recommended):
```python
from sedona.spark import SedonaUtils
-# Or from sedona.spark import *
-
df = (
sedona.read.format("binaryFile")
.load(DATA_DIR + "raster.tiff")
.selectExpr("RS_FromGeoTiff(content) as raster")
)
+
+# Pass the raw raster DataFrame directly — RS_AsImage is applied automatically
+SedonaUtils.display_image(df)
+```
+
+Example — with explicit RS_AsImage:
+
+```python
htmlDF = df.selectExpr("RS_AsImage(raster, 500) as raster_image")
SedonaUtils.display_image(htmlDF)
```
diff --git a/docs/tutorial/raster.md b/docs/tutorial/raster.md
index bb776b6b3b..717b688ab5 100644
--- a/docs/tutorial/raster.md
+++ b/docs/tutorial/raster.md
@@ -472,6 +472,17 @@ The output looks like this:

+!!!tip
+ In a Jupyter notebook, use `SedonaUtils.display_image` to render rasters
directly — no need to call RS_AsImage manually:
+
+ ```python
+ from sedona.spark import SedonaUtils
+
+ SedonaUtils.display_image(rasterDf)
+ ```
+
+ See [Display raster in Jupyter](../api/sql/Raster-visualizer.md#display-raster-in-jupyter) for details.
+
### 2-D Matrix
Sedona offers an API to visualize raster data that is not sufficient for the other APIs mentioned above.
@@ -531,6 +542,15 @@ Please refer to [Raster writer docs](../api/sql/Raster-writer.md) for more detai
Sedona allows collecting Dataframes with raster columns and working with them locally in Python since `v1.6.0`.
The raster objects are represented as `SedonaRaster` objects in Python, which can be used to perform raster operations.
+!!!tip
+ If you just want to quickly visualize a raster in Jupyter, use `SedonaUtils.display_image(df)` instead of collecting the DataFrame:
+
+ ```python
+ from sedona.spark import SedonaUtils
+
+ SedonaUtils.display_image(df_raster)
+ ```
+
```python
df_raster = (
sedona.read.format("binaryFile")
diff --git a/python/sedona/spark/raster_utils/SedonaUtils.py b/python/sedona/spark/raster_utils/SedonaUtils.py
index f292bd490a..11dba5868f 100644
--- a/python/sedona/spark/raster_utils/SedonaUtils.py
+++ b/python/sedona/spark/raster_utils/SedonaUtils.py
@@ -21,7 +21,40 @@ from sedona.spark.maps.SedonaMapUtils import SedonaMapUtils
class SedonaUtils:
@classmethod
def display_image(cls, df):
+ """Display raster images in a Jupyter notebook.
+
+ Accepts DataFrames with either:
+ - A raster column (GridCoverage2D) — auto-applies RS_AsImage
+ - An HTML image column from RS_AsImage() — renders directly
+
+ Falls back to the SedonaMapUtils HTML table path for other DataFrames.
+ """
from IPython.display import HTML, display
+ schema = df.schema
+
+ # Detect raster UDT columns and auto-apply RS_AsImage.
+ # Without this, passing a raw raster DataFrame to the fallback path
+ # causes __convert_to_gdf_or_pdf__ to Arrow-serialize the full raster
+ # grid, which hangs for large rasters (e.g., 1400x800).
+ raster_cols = [
+ f.name
+ for f in schema.fields
+ if hasattr(f.dataType, "typeName") and f.dataType.typeName() == "rastertype"
+ ]
+ if raster_cols:
+ # Replace each raster column with its RS_AsImage() HTML representation,
+ # preserving all other columns in the DataFrame.
+ select_exprs = []
+ for f in schema.fields:
+ col_name_escaped = f.name.replace("`", "``")
+ if f.name in raster_cols:
+ select_exprs.append(
+ f"RS_AsImage(`{col_name_escaped}`) as `{col_name_escaped}`"
+ )
+ else:
+ select_exprs.append(f"`{col_name_escaped}`")
+ df = df.selectExpr(*select_exprs)
+
pdf = SedonaMapUtils.__convert_to_gdf_or_pdf__(df, rename=False)
display(HTML(pdf.to_html(escape=False)))
diff --git a/python/tests/raster_viz_utils/test_sedonautils.py b/python/tests/raster_viz_utils/test_sedonautils.py
index f3c8dae994..2f8993b593 100644
--- a/python/tests/raster_viz_utils/test_sedonautils.py
+++ b/python/tests/raster_viz_utils/test_sedonautils.py
@@ -26,12 +26,29 @@ class TestSedonaUtils(TestBase):
raster_bin_df = self.spark.read.format("binaryFile").load(
world_map_raster_input_location
)
- raster_bin_df.createOrReplaceTempView("raster_binary_table")
- raster_df = self.spark.sql(
- "SELECT RS_FromGeotiff(content) as raster from raster_binary_table"
- )
+ raster_df = raster_bin_df.selectExpr("RS_FromGeotiff(content) as raster")
raster_image_df = raster_df.selectExpr("RS_AsImage(raster) as rast_img")
html_call = SedonaUtils.display_image(raster_image_df)
assert (
html_call is None
) # just test that this function was called and returned no output
+
+ def test_display_image_raw_raster(self):
+ """Test that display_image auto-detects raster columns and applies RS_AsImage."""
+ raster_bin_df = self.spark.read.format("binaryFile").load(
+ world_map_raster_input_location
+ )
+ raster_df = raster_bin_df.selectExpr("RS_FromGeotiff(content) as raster")
+ html_call = SedonaUtils.display_image(raster_df)
+ assert html_call is None
+
+ def test_display_image_preserves_non_raster_columns(self):
+ """Test that non-raster columns are preserved alongside raster columns."""
+ raster_bin_df = self.spark.read.format("binaryFile").load(
+ world_map_raster_input_location
+ )
+ raster_df = raster_bin_df.selectExpr(
+ "path", "RS_FromGeotiff(content) as raster"
+ )
+ html_call = SedonaUtils.display_image(raster_df)
+ assert html_call is None