paleolimbot commented on PR #240:
URL: https://github.com/apache/parquet-format/pull/240#issuecomment-2546234738

   > separate from the spec it might be good to start discussions on what the 
implementation for GeoParquet might look like (e.g. what new dependencies do we 
plan on taking on for reference implementation? What would APIs look like?)
   
   @emkornfield I think this is a good idea! The PoC implementations 
specifically may not handle writing statistics for non-planar edges depending 
on the final call on whether the statistics are always Cartesian min/max (i.e., 
wrong for spherical edges, so readers should ignore them), or whether the 
statistics take curved edges into account for the non-planar case (requires 
non-trivial computational effort and complexity on behalf of the writer, but 
eliminates that effort and complexity for the reader). Discussions in Iceberg 
have converged on the latter, which means we may have to figure out how to plug 
in S2 and/or Boost::Geometry when writing statistics in C++ (I can't speak for 
Java). Off the top of my head, it could be a Parquet-specific hook to 
override stats for a column chunk, the name of an Arrow compute UDF that can 
compute the required box, or willingness to take on Boost or S2 as a dependency 
in that section of the code. (I don't think that's required for the PoC, 
personally, but I'm also happy to prototype any of those if somebody does.)
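
   To illustrate why the two options differ, here is a minimal sketch in plain 
Python (not the S2 or Boost::Geometry API; the function name 
`great_circle_lat_range` and its helpers are hypothetical). It assumes a unit 
sphere and an edge that follows the minor great-circle arc between its two 
endpoints, and it only computes the latitude range. Longitude handling, 
including antimeridian wraparound, is left out.

```python
import math


def _unit(lon_deg, lat_deg):
    """Convert lon/lat in degrees to a unit vector on the sphere."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (
        math.cos(lat) * math.cos(lon),
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
    )


def _cross(u, v):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def great_circle_lat_range(lon1, lat1, lon2, lat2):
    """Return (min_lat, max_lat) in degrees along the minor great-circle arc.

    Hypothetical helper for illustration only; assumes a unit sphere.
    """
    # Cartesian answer: just the endpoint latitudes.
    lo, hi = min(lat1, lat2), max(lat1, lat2)

    a, b = _unit(lon1, lat1), _unit(lon2, lat2)
    n = _cross(a, b)
    n_len = math.sqrt(_dot(n, n))
    if n_len < 1e-12:
        # Identical or antipodal endpoints: the arc is not well defined.
        return lo, hi
    n = tuple(c / n_len for c in n)

    # Project the z axis onto the great-circle plane to find the point of
    # highest latitude on the full circle (its negation is the lowest).
    apex = (-n[2] * n[0], -n[2] * n[1], 1.0 - n[2] * n[2])
    apex_len = math.sqrt(_dot(apex, apex))
    if apex_len < 1e-12:
        # The edge lies on the equator, so latitude is constant.
        return lo, hi
    apex = tuple(c / apex_len for c in apex)

    for sign in (1.0, -1.0):
        p = tuple(sign * c for c in apex)
        # p is on the minor arc only if it lies between a and b.
        if _dot(_cross(a, p), n) >= 0 and _dot(_cross(p, b), n) >= 0:
            lat = math.degrees(math.asin(max(-1.0, min(1.0, p[2]))))
            lo, hi = min(lo, lat), max(hi, lat)
    return lo, hi
```

   For an edge from (lon 0, lat 45) to (lon 90, lat 45), the endpoint-only 
range is 45 to 45, while this sketch reports a maximum of roughly 54.7 degrees, 
so a Cartesian min/max box would miss part of the edge.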


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

