paleolimbot commented on code in PR #240:
URL: https://github.com/apache/parquet-format/pull/240#discussion_r1941609853


##########
Geospatial.md:
##########
@@ -0,0 +1,144 @@
+<!--
+  - Licensed to the Apache Software Foundation (ASF) under one
+  - or more contributor license agreements.  See the NOTICE file
+  - distributed with this work for additional information
+  - regarding copyright ownership.  The ASF licenses this file
+  - to you under the Apache License, Version 2.0 (the
+  - "License"); you may not use this file except in compliance
+  - with the License.  You may obtain a copy of the License at
+  -
+  -   http://www.apache.org/licenses/LICENSE-2.0
+  -
+  - Unless required by applicable law or agreed to in writing,
+  - software distributed under the License is distributed on an
+  - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  - KIND, either express or implied.  See the License for the
+  - specific language governing permissions and limitations
+  - under the License.
+  -->
+
+Geospatial Definitions
+====
+
+This document contains the specification of geospatial types and statistics.
+
+# Background
+
+The Geometry and Geography class hierarchy and its Well-Known Text (WKT) and
+Well-Known Binary (WKB) serializations (ISO supporting XY, XYZ, XYM, XYZM) are
+defined by [OpenGIS Implementation Specification for Geographic information –
+Simple feature access – Part 1: Common architecture][sfa-part1], from [OGC
+(Open Geospatial Consortium)][ogc].
+
+The version of the OGC standard first used here is 1.2.1, but future versions
+may also be used if the WKB representation remains wire-compatible.
+
+[sfa-part1]: https://portal.ogc.org/files/?artifact_id=25355
+[ogc]: https://www.ogc.org/standard/sfa/
+
+## Coordinate Reference System
+
+Coordinate Reference System (CRS) is a mapping of how coordinates refer to
+locations on Earth.
+
+The default CRS `OGC:CRS84` means that the objects must be stored in longitude,
+latitude based on the WGS84 datum.
+
+Custom CRS can be specified by a string value. It is recommended to use the
+identifier of the CRS like [Spatial reference identifier][srid] and
+[PROJJSON][projjson].

Review Comment:
   Yes, PROJJSON optionally embeds an identifier in its JSON structure if the
CRS has one. However, some of the data we are trying to convince large
organizations/governments to distribute in Parquet don't have an
authority/code, and some require more than one authority/code because the CRS
for the x-y is specified separately from the z.
   
   Because we've gone in quite a few circles on this one, my preference is just 
a string representation of the CRS with no further specification (i.e., 
writer/reader is responsible for serializing and deserializing the CRS, 
respectively).
   
   If that isn't acceptable, I would add "writers should write the most compact 
form of CRS that fully describes the CRS. Identifiers should be used where 
possible and written in the form authority:code (e.g., `OGC:CRS84` to specify 
longitude/latitude on the WGS84 ellipsoid)." That definition would result in 
99.9% of geometry columns having a compact (but self-contained) CRS definition 
(authority:code), while also allowing producers to write whatever an upstream 
library provided them.
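   
   To make that second option concrete, here is a rough sketch of what a writer could do under that wording. It assumes the upstream library hands over a PROJJSON object as a Python dict. The helper name `compact_crs_string` and the abbreviated example dicts are hypothetical and not part of any spec or library:

```python
import json


def compact_crs_string(projjson: dict) -> str:
    """Hypothetical helper: return 'authority:code' when the CRS carries a
    single top-level identifier, otherwise the full PROJJSON text."""
    crs_id = projjson.get("id")
    if isinstance(crs_id, dict) and "authority" in crs_id and "code" in crs_id:
        # Compact form, e.g. OGC:CRS84
        return f"{crs_id['authority']}:{crs_id['code']}"
    # No single authority/code (e.g. a custom or compound CRS):
    # keep the self-contained definition instead.
    return json.dumps(projjson, separators=(",", ":"), sort_keys=True)


# Abbreviated, illustrative PROJJSON fragments (not complete CRS definitions).
with_id = {
    "type": "GeographicCRS",
    "name": "WGS 84 (CRS84)",
    "id": {"authority": "OGC", "code": "CRS84"},
}
without_id = {"type": "CompoundCRS", "name": "custom x-y + z", "components": []}

print(compact_crs_string(with_id))
print(compact_crs_string(without_id))
```

   A CRS with no top-level `id` falls through to the full JSON string, which covers the "no authority/code" and "more than one authority/code" cases mentioned above.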
   
   Barring either of those being acceptable, I would just make the 
`projjson:some_schema_metadata_field` language explicit.
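   
   For that last option, a reader-side sketch might look like the following. It assumes that `projjson:<field>` means "the PROJJSON text is stored under `<field>` in the file's key-value metadata", which is my reading rather than settled spec language. The function name `resolve_crs` and the plain-dict metadata are hypothetical:

```python
from typing import Mapping, Optional

DEFAULT_CRS = "OGC:CRS84"  # default CRS named in the draft text
PROJJSON_PREFIX = "projjson:"


def resolve_crs(crs: Optional[str], key_value_metadata: Mapping[str, str]) -> str:
    """Hypothetical reader helper: turn the CRS string stored on the column
    into something a CRS library can parse."""
    if crs is None:
        # No CRS written: fall back to the default.
        return DEFAULT_CRS
    if crs.startswith(PROJJSON_PREFIX):
        # Assumed meaning: look up the PROJJSON text in file metadata.
        field = crs[len(PROJJSON_PREFIX):]
        return key_value_metadata[field]
    # Anything else (e.g. authority:code) is passed through unchanged.
    return crs
```

   For example, `resolve_crs("projjson:geo_crs", {"geo_crs": "{...}"})` would return the stored JSON text, and a missing field raises `KeyError` so a dangling reference is not silently ignored.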



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


