paleolimbot commented on code in PR #360:
URL: https://github.com/apache/sedona-db/pull/360#discussion_r2561027328


##########
c/sedona-geos/src/st_minimumclearance_line.rs:
##########
@@ -113,7 +113,6 @@ mod tests {
             Some("POLYGON((0 0,0 3,3 3,3 0,0 0),(1 1,1 2,2 2,2 1,1 1))"),
             Some("POLYGON((0 0,0 1,0 1,1 1,1 0,0 0,0 0))"),
             Some("LINESTRING (0 0, 1 1, 2 2)"),
-            Some("MULTIPOLYGON(((0.5 0.5,0 0,0 1,0.5 0.5)),((0.5 0.5,1 1,1 0,0.5 0.5)),((2.5 2.5,2 2,2 3,2.5 2.5)),((2.5 2.5,3 3,3 2,2.5 2.5)))"),

Review Comment:
   We already have a multipolygon test case here, and this one fails for me
locally (we've also removed these cases from Python).



##########
c/sedona-proj/src/sd_order_lnglat.rs:
##########
@@ -0,0 +1,205 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::{fmt::Debug, sync::Arc};
+
+use arrow_array::builder::UInt64Builder;
+use arrow_schema::DataType;
+use datafusion_common::{DataFusionError, Result};
+use datafusion_expr::ColumnarValue;
+use sedona_expr::scalar_udf::SedonaScalarKernel;
+use sedona_functions::executor::WkbBytesExecutor;
+use sedona_geometry::{transform::CrsEngine, wkb_header::WkbHeader};
+use sedona_schema::{crs::lnglat, datatypes::SedonaType, matchers::ArgMatcher};
+
+use crate::st_transform::with_global_proj_engine;
+
+/// Generic scalar kernel for sd_order based on the first coordinate
+/// of a geometry projected to lon/lat
+///
+/// This [SedonaScalarKernel] requires the actual function (e.g., S2, H3,
+/// or A5 cell identifier) to be provided but takes care of the extraction
+/// of the first coordinate and projecting to lon/lat space. The provided
+/// function must return a `u64`.
+pub struct OrderLngLat<F> {
+    order_fn: F,
+}

Review Comment:
   This is the actual ordering implementation for geometry/geography. It's a
little awkward because it has to pull in two somewhat disjoint dependencies
(something to project to lon/lat and something to compute a value). The
projection needs the global engine, so I put it here and parameterized it on
the part that's easier to mock (the actual ordering function).
   
   There are definitely more effective orderings that consider the full
geometry (e.g., xz2); however, this one is a good default because it's fast
(it doesn't parse or reproject more than one coordinate) and works for
geometry, geography, and the forthcoming row-level CRS type. Adding other
orderable hashes (e.g., geohash, xz2) is harder because replicating
compatibility with existing behaviour could take some time.
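
   A minimal sketch of the parameterization described above, not the code in
this PR: `OrderLngLatSketch`, `order_key`, and `toy_z_order` are hypothetical
names, the input is assumed to already be in lon/lat (the real kernel projects
through the global PROJ engine first), and the bit-interleaving function is
only a stand-in for an S2/H3/A5 cell identifier.

```rust
/// Hypothetical, simplified stand-in for the kernel: it owns an ordering
/// function and applies it to the first (lon, lat) coordinate only.
struct OrderLngLatSketch<F> {
    order_fn: F,
}

impl<F: Fn((f64, f64)) -> u64> OrderLngLatSketch<F> {
    fn new(order_fn: F) -> Self {
        Self { order_fn }
    }

    /// Returns None for an empty geometry (there is no first coordinate).
    fn order_key(&self, coords: &[(f64, f64)]) -> Option<u64> {
        coords.first().map(|&xy| (self.order_fn)(xy))
    }
}

/// Toy ordering function (an assumption for illustration): quantize lon and
/// lat to 16 bits each and interleave the bits into a Z-order key.
fn toy_z_order((lng, lat): (f64, f64)) -> u64 {
    let qx = (((lng + 180.0) / 360.0).clamp(0.0, 1.0) * 65535.0) as u64;
    let qy = (((lat + 90.0) / 180.0).clamp(0.0, 1.0) * 65535.0) as u64;
    let mut key = 0u64;
    for bit in 0..16 {
        key |= ((qx >> bit) & 1) << (2 * bit);
        key |= ((qy >> bit) & 1) << (2 * bit + 1);
    }
    key
}
```

Because the ordering function is a type parameter, a test can swap in a
constant closure (e.g., `|_| 42u64`) and exercise the first-coordinate logic
without pulling in a real cell-indexing dependency.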



##########
rust/sedona-functions/src/sd_order.rs:
##########
@@ -0,0 +1,118 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion_common::Result;
+use datafusion_expr::{
+    scalar_doc_sections::DOC_SECTION_OTHER, ColumnarValue, Documentation, Volatility,
+};
+use sedona_expr::scalar_udf::{SedonaScalarKernel, SedonaScalarUDF};
+use sedona_schema::datatypes::SedonaType;
+use std::{fmt::Debug, sync::Arc};
+
+/// SD_Order() scalar UDF implementation
+///
+/// This function is invoked to obtain a proxy array whose order may be used
+/// to sort based on the value. The default implementation returns the value
+/// and a utility is provided to order geometry and/or geographies based on
+/// the first coordinate. More sophisticated sorting (e.g., XZ2) may be added
+/// in the future.
+pub fn sd_order_udf() -> SedonaScalarUDF {

Review Comment:
   This is the main change: we add `sd_order()`, whose registered kernels
define how to order values (including values that are not meaningfully
orderable with the default implementation).



##########
rust/sedona-schema/src/crs.rs:
##########
@@ -97,10 +97,37 @@ impl PartialEq<dyn CoordinateReferenceSystem + Send + Sync>
 /// A trait defining the minimum required properties of a concrete coordinate
 /// reference system, allowing the details of this to be implemented elsewhere.
 pub trait CoordinateReferenceSystem: Debug {
+    /// Compute the representation of this Crs in the form required for JSON output
+    ///
+    /// The output must be valid JSON (e.g., arbitrary strings must be quoted).
     fn to_json(&self) -> String;

Review Comment:
   I noticed these weren't documented, so I added some docs.



##########
c/sedona-proj/src/st_transform.rs:
##########
@@ -293,7 +293,7 @@ fn invoke_scalar(wkb: &Wkb, trans: &dyn CrsTransform, builder: &mut BinaryBuilde
 fn parse_source_crs(source_type: &SedonaType) -> Result<Option<String>> {
     match source_type {
         SedonaType::Wkb(_, Some(crs)) | SedonaType::WkbView(_, Some(crs)) => {
-            crs.to_authority_code()
+            Ok(Some(crs.to_crs_string()))

Review Comment:
   This fixes a bug in `st_transform()`: previously any CRS that was 
unrepresentable with a single authority/code (e.g., the ns-water test data) 
would have returned `None` here. Because we have `lenient`, `st_transform()` 
would have assumed the source data to be lon/lat even though it was correctly 
annotated.
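
   A toy reproduction of the failure mode described above, under stated
assumptions: `ToyCrs`, `parse_source_crs_old`, `parse_source_crs_new`, and
`resolve_lenient` are hypothetical simplifications (no `Result`, no
`SedonaType`), the fallback in `to_crs_string` is a guess at the intended
behaviour, and `"OGC:CRS84"` stands in for whatever lon/lat default the
lenient path uses.

```rust
/// Hypothetical CRS: an optional authority:code plus a full definition.
struct ToyCrs {
    authority_code: Option<String>,
    definition: String,
}

impl ToyCrs {
    fn to_authority_code(&self) -> Option<String> {
        self.authority_code.clone()
    }

    /// Assumption: fall back to the full definition when no code exists.
    fn to_crs_string(&self) -> String {
        self.authority_code
            .clone()
            .unwrap_or_else(|| self.definition.clone())
    }
}

/// Old behaviour: a CRS without an authority code collapses to None.
fn parse_source_crs_old(crs: Option<&ToyCrs>) -> Option<String> {
    crs.and_then(|c| c.to_authority_code())
}

/// New behaviour: any annotated CRS yields Some(string).
fn parse_source_crs_new(crs: Option<&ToyCrs>) -> Option<String> {
    crs.map(|c| c.to_crs_string())
}

/// Lenient resolution: a missing source CRS is assumed to be lon/lat.
fn resolve_lenient(source: Option<String>) -> String {
    source.unwrap_or_else(|| "OGC:CRS84".to_string())
}
```

With a CRS that has a definition but no authority code, the old path resolves
to the lon/lat default while the new path keeps the annotated definition; a
CRS that does have a code resolves identically in both.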



##########
rust/sedona-functions/src/sd_order.rs:
##########
@@ -0,0 +1,118 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use datafusion_common::Result;
+use datafusion_expr::{
+    scalar_doc_sections::DOC_SECTION_OTHER, ColumnarValue, Documentation, Volatility,
+};
+use sedona_expr::scalar_udf::{SedonaScalarKernel, SedonaScalarUDF};
+use sedona_schema::datatypes::SedonaType;
+use std::{fmt::Debug, sync::Arc};
+
+/// SD_Order() scalar UDF implementation
+///
+/// This function is invoked to obtain a proxy array whose order may be used
+/// to sort based on the value. The default implementation returns the value
+/// and a utility is provided to order geometry and/or geographies based on
+/// the first coordinate. More sophisticated sorting (e.g., XZ2) may be added
+/// in the future.
+pub fn sd_order_udf() -> SedonaScalarUDF {
+    SedonaScalarUDF::new(
+        "sd_order",
+        vec![Arc::new(SDOrderDefault {})],
+        Volatility::Immutable,
+        Some(sd_order_doc()),
+    )
+}
+
+fn sd_order_doc() -> Documentation {
+    Documentation::builder(
+        DOC_SECTION_OTHER,
+        "Return an arbitrary value that may be used to sort the input.",
+        "SD_Order (value: Any)",
+    )
+    .with_argument("value", "Any: An arbitrary value")
+    .with_sql_example("SELECT SD_Order()")
+    .build()
+}
+
+/// Default implementation that returns its input (i.e., by default, just
+/// do whatever DataFusion would have done with the value)
+#[derive(Debug)]
+struct SDOrderDefault {}
+
+impl SedonaScalarKernel for SDOrderDefault {

Review Comment:
   ...where, by default, we just fall back on the storage ordering
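
   A rough sketch of the kernel-dispatch idea, with everything simplified:
`Value`, `OrderKernel`, `DefaultOrder`, `PointOrder`, and `OrderUdf` are
hypothetical names, and giving the most recently added kernel the first chance
to match is an assumption here rather than a statement about how
`SedonaScalarUDF` resolves kernels.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Point(f64, f64),
}

trait OrderKernel {
    /// Return Some(proxy) if this kernel applies to the value, else None.
    fn try_order(&self, value: &Value) -> Option<Value>;
}

/// Default: return the input unchanged, i.e. fall back on storage ordering.
struct DefaultOrder;

impl OrderKernel for DefaultOrder {
    fn try_order(&self, value: &Value) -> Option<Value> {
        Some(value.clone())
    }
}

/// Toy specialised kernel: order points by the floor of their x coordinate.
struct PointOrder;

impl OrderKernel for PointOrder {
    fn try_order(&self, value: &Value) -> Option<Value> {
        match value {
            Value::Point(x, _) => Some(Value::Int(x.floor() as i64)),
            _ => None,
        }
    }
}

struct OrderUdf {
    kernels: Vec<Box<dyn OrderKernel>>,
}

impl OrderUdf {
    fn new() -> Self {
        Self { kernels: vec![Box::new(DefaultOrder)] }
    }

    fn add_kernel(&mut self, kernel: Box<dyn OrderKernel>) {
        self.kernels.push(kernel);
    }

    fn invoke(&self, value: &Value) -> Value {
        // Assumption: the most recently added kernel is tried first.
        self.kernels
            .iter()
            .rev()
            .find_map(|k| k.try_order(value))
            .expect("the default kernel matches every value")
    }
}
```

Until a specialised kernel is added, every value comes back unchanged; once
`PointOrder` is registered, points get a proxy key while other values still
fall through to the default.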



##########
rust/sedona-schema/src/crs.rs:
##########
@@ -97,10 +97,37 @@ impl PartialEq<dyn CoordinateReferenceSystem + Send + Sync>
 /// A trait defining the minimum required properties of a concrete coordinate
 /// reference system, allowing the details of this to be implemented elsewhere.
 pub trait CoordinateReferenceSystem: Debug {
+    /// Compute the representation of this Crs in the form required for JSON output
+    ///
+    /// The output must be valid JSON (e.g., arbitrary strings must be quoted).
     fn to_json(&self) -> String;
+
+    /// Compute the representation of this Crs as a string in the form Authority:Code
+    ///
+    /// If there is no such representation, returns None.
     fn to_authority_code(&self) -> Result<Option<String>>;
+
+    /// Compute CRS equality
+    ///
+    /// CRS equality is a relatively thorny topic and can be difficult to compute;
+    /// however, this method should try to compare self and other on value (e.g.,
+    /// comparing authority_code where possible).
     fn crs_equals(&self, other: &dyn CoordinateReferenceSystem) -> bool;
+
+    /// Reduce this beautiful, rich CRS representation to a mere integer if possible
+    ///
+    /// For the purposes of this trait, an SRID is always equivalent to the
+    /// authority_code `"EPSG:<srid>"`. Note that other SRID representations
+    /// (e.g., GeoArrow, Parquet GEOMETRY/GEOGRAPHY) do not make any guarantees
+    /// that an SRID comes from the EPSG authority.
     fn srid(&self) -> Result<Option<u32>>;
+
+    /// Compute a CRS string representation
+    ///
+    /// Unlike `to_json()`, arbitrary string values returned by this method should
+    /// not be escaped. This is the representation expected as input to PROJ, GDAL,
+    /// and Parquet GEOMETRY/GEOGRAPHY representations of CRS.
+    fn to_crs_string(&self) -> String;

Review Comment:
   I added this one because we need it for Parquet GEOMETRY/GEOGRAPHY too (in 
addition to throwing this Crs object into PROJ).
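
   To illustrate the distinction the doc comments draw between the two
methods, here is a small sketch over plain strings. The function names are
hypothetical, and treating any value that starts with `{` as already-valid
JSON is a simplifying assumption, not the trait's actual logic.

```rust
/// JSON form: a non-object value must be emitted as a quoted JSON string.
fn to_json_sketch(crs: &str) -> String {
    if crs.trim_start().starts_with('{') {
        // Assumed to be a JSON object already (e.g., PROJJSON); pass through.
        crs.to_string()
    } else {
        // Escape backslashes and quotes, then wrap in double quotes.
        let escaped = crs.replace('\\', "\\\\").replace('"', "\\\"");
        format!("\"{}\"", escaped)
    }
}

/// CRS string form: the raw value, unquoted and unescaped.
fn to_crs_string_sketch(crs: &str) -> String {
    crs.to_string()
}
```

For `EPSG:4326`, the JSON form carries surrounding quotes while the CRS string
form is the bare `EPSG:4326`, which is the shape a consumer such as PROJ
expects.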


