wgtmac commented on code in PR #2971:
URL: https://github.com/apache/parquet-java/pull/2971#discussion_r2038959009
##########
parquet-column/src/main/java/org/apache/parquet/column/ParquetProperties.java:
##########
@@ -65,6 +65,8 @@ public class ParquetProperties {
public static final int DEFAULT_BLOOM_FILTER_CANDIDATES_NUMBER = 5;
public static final boolean DEFAULT_STATISTICS_ENABLED = true;
public static final boolean DEFAULT_SIZE_STATISTICS_ENABLED = true;
+ public static final boolean DEFAULT_GEO_STATISTICS_ENABLED = true;
Review Comment:
```suggestion
```
This looks like a duplicate; please delete one of them.
##########
parquet-cli/src/main/java/org/apache/parquet/cli/commands/ShowGeospatialStatisticsCommand.java:
##########
@@ -0,0 +1,110 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.cli.commands;
+
+import com.beust.jcommander.Parameter;
+import com.beust.jcommander.Parameters;
+import com.google.common.base.Preconditions;
+import com.google.common.collect.Lists;
+import java.io.IOException;
+import java.util.List;
+import org.apache.commons.text.TextStringBuilder;
+import org.apache.parquet.cli.BaseCommand;
+import org.apache.parquet.column.statistics.geometry.GeospatialStatistics;
+import org.apache.parquet.hadoop.ParquetFileReader;
+import org.apache.parquet.hadoop.metadata.BlockMetaData;
+import org.apache.parquet.hadoop.metadata.ColumnChunkMetaData;
+import org.apache.parquet.hadoop.metadata.ParquetMetadata;
+import org.apache.parquet.schema.MessageType;
+import org.slf4j.Logger;
+
+@Parameters(commandDescription = "Print geospatial statistics for a Parquet file")
+public class ShowGeospatialStatisticsCommand extends BaseCommand {
Review Comment:
Thanks for adding this command! Please also update parquet-cli/README to
reflect this new option.
##########
parquet-column/src/main/java/org/apache/parquet/column/impl/ColumnValueCollector.java:
##########
@@ -60,6 +64,9 @@ void resetPageStatistics() {
path.getPrimitiveType(), path.getMaxRepetitionLevel(),
path.getMaxDefinitionLevel())
: SizeStatistics.noopBuilder(
path.getPrimitiveType(), path.getMaxRepetitionLevel(),
path.getMaxDefinitionLevel());
+ this.geospatialStatisticsBuilder = geospatialStatisticsEnabled
Review Comment:
Ditto: we can reuse `statisticsEnabled` for this.
##########
parquet-column/src/main/java/org/apache/parquet/column/ParquetProperties.java:
##########
@@ -114,6 +116,7 @@ public static WriterVersion fromString(String name) {
private final int statisticsTruncateLength;
private final boolean statisticsEnabled;
private final boolean sizeStatisticsEnabled;
+ private final boolean geospatialStatisticsEnabled;
Review Comment:
Why not simply reuse `statisticsEnabled`? It seems that we don't need an
extra flag for a single type. The C++ implementation does exactly this. If you
agree, we can revert all changes in this file.
##########
pom.xml:
##########
@@ -94,7 +94,7 @@
<shade.prefix>shaded.parquet</shade.prefix>
<!-- Guarantees no newer classes/methods/constants are used by parquet. -->
<hadoop.version>3.3.0</hadoop.version>
- <parquet.format.version>2.10.0</parquet.format.version>
+ <parquet.format.version>2.11.0</parquet.format.version>
Review Comment:
It is time to rebase this PR :)
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -147,6 +151,30 @@ protected LogicalTypeAnnotation fromString(List<String>
params) {
return float16Type();
}
},
+ GEOMETRY {
+ @Override
+ protected LogicalTypeAnnotation fromString(List<String> params) {
+ if (params.size() < 1) {
+ throw new RuntimeException(
+ "Expecting at least 1 parameter for geometry logical type, got "
+ params.size());
+ }
+ String crs = params.size() > 0 ? params.get(0) : null;
+ return geometryType(crs);
+ }
+ },
+ GEOGRAPHY {
+ @Override
+ protected LogicalTypeAnnotation fromString(List<String> params) {
+ if (params.size() < 1) {
Review Comment:
ditto
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -1128,6 +1172,140 @@ public int hashCode() {
}
}
+ public static class GeometryLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+
+ private GeometryLogicalTypeAnnotation(String crs) {
+ this.crs = crs;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOMETRY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
Review Comment:
```suggestion
```
##########
parquet-column/src/main/java/org/apache/parquet/column/statistics/geometry/GeospatialTypes.java:
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.statistics.geometry;
+
+import java.util.HashSet;
+import java.util.Set;
+import java.util.stream.Collectors;
+import org.apache.parquet.Preconditions;
+import org.locationtech.jts.geom.Coordinate;
+import org.locationtech.jts.geom.Geometry;
+
+public class GeospatialTypes {
+
+ private static final int UNKNOWN_TYPE_ID = -1;
+ private Set<Integer> types = new HashSet<>();
+ private boolean valid = true;
+
+ public GeospatialTypes(Set<Integer> types) {
+ this.types = types;
+ }
+
+ public GeospatialTypes() {}
+
+ public Set<Integer> getTypes() {
+ return types;
+ }
+
+ void update(Geometry geometry) {
+ if (!valid) {
+ return;
+ }
+ int code = getGeometryTypeCode(geometry);
+ if (code != UNKNOWN_TYPE_ID) {
+ types.add(code);
+ } else {
+ valid = false;
+ types.clear();
+ }
+ }
+
+ public void merge(GeospatialTypes other) {
+ Preconditions.checkArgument(other != null, "Cannot merge with null GeospatialTypes");
+ if (!valid) {
+ return;
+ }
+ if (!other.valid) {
+ valid = false;
+ types.clear();
+ return;
+ }
+ types.addAll(other.types);
+ }
+
+ public void reset() {
+ types.clear();
+ valid = true;
+ }
+
+ public void abort() {
+ valid = false;
+ types.clear();
+ }
+
+ public GeospatialTypes copy() {
+ return new GeospatialTypes(new HashSet<>(types));
+ }
+
+ @Override
+ public String toString() {
+ return "GeospatialTypes{" + "types="
+ + types.stream().map(this::typeIdToString).collect(Collectors.toSet())
+ '}';
+ }
+
+ private int getGeometryTypeId(Geometry geometry) {
+ switch (geometry.getGeometryType()) {
+ case Geometry.TYPENAME_POINT:
+ return 1;
+ case Geometry.TYPENAME_LINESTRING:
+ return 2;
+ case Geometry.TYPENAME_POLYGON:
+ return 3;
+ case Geometry.TYPENAME_MULTIPOINT:
+ return 4;
+ case Geometry.TYPENAME_MULTILINESTRING:
+ return 5;
+ case Geometry.TYPENAME_MULTIPOLYGON:
+ return 6;
+ case Geometry.TYPENAME_GEOMETRYCOLLECTION:
+ return 7;
+ default:
+ return UNKNOWN_TYPE_ID;
+ }
+ }
+
+ /**
+ * Geospatial type codes:
+ *
+ * | Type | XY | XYZ | XYM | XYZM |
+ * | :----------------- | :--- | :--- | :--- | :--: |
+ * | Point | 0001 | 1001 | 2001 | 3001 |
+ * | LineString | 0002 | 1002 | 2002 | 3002 |
+ * | Polygon | 0003 | 1003 | 2003 | 3003 |
+ * | MultiPoint | 0004 | 1004 | 2004 | 3004 |
+ * | MultiLineString | 0005 | 1005 | 2005 | 3005 |
+ * | MultiPolygon | 0006 | 1006 | 2006 | 3006 |
+ * | GeometryCollection | 0007 | 1007 | 2007 | 3007 |
+ *
+ * See https://github.com/apache/parquet-format/blob/master/Geospatial.md#geospatial-types
+ */
+ private int getGeometryTypeCode(Geometry geometry) {
+ int typeId = getGeometryTypeId(geometry);
+ if (typeId == UNKNOWN_TYPE_ID) {
+ return UNKNOWN_TYPE_ID;
+ }
+ Coordinate[] coordinates = geometry.getCoordinates();
+ boolean hasZ = false;
+ boolean hasM = false;
+ for (Coordinate coordinate : coordinates) {
+ if (!Double.isNaN(coordinate.getZ())) {
+ hasZ = true;
+ }
+ if (!Double.isNaN(coordinate.getM())) {
+ hasM = true;
+ }
+ if (hasZ && hasM) {
+ break;
Review Comment:
If all coordinates are `XY` or `XYZ`, then we will have to loop through all
coordinates, right?
Is it safe to check only the 1st coordinate?
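
For reference, the type-code table in the Javadoc above amounts to adding 1000 when Z is present and 2000 when M is present. The sketch below illustrates that arithmetic and contrasts the full scan with a first-coordinate shortcut. `TypeCodeSketch` and its method names are hypothetical, coordinates are modelled as plain `{x, y, z, m}` arrays with `NaN` for a missing ordinate instead of JTS `Coordinate` objects, and the shortcut is only valid under the assumption that every coordinate of a geometry has the same dimensionality.

```java
final class TypeCodeSketch {
  // Combine the base type id (1..7) with the dimension offsets from the table:
  // XY = +0, XYZ = +1000, XYM = +2000, XYZM = +3000.
  static int typeCode(int typeId, boolean hasZ, boolean hasM) {
    int code = typeId;
    if (hasZ) code += 1000;
    if (hasM) code += 2000;
    return code;
  }

  // Full scan: exits early only once both Z and M have been seen,
  // so an all-XY or all-XYZ input is read to the end.
  static int typeCodeFullScan(int typeId, double[][] coords) {
    boolean hasZ = false;
    boolean hasM = false;
    for (double[] c : coords) {
      hasZ |= !Double.isNaN(c[2]);
      hasM |= !Double.isNaN(c[3]);
      if (hasZ && hasM) break;
    }
    return typeCode(typeId, hasZ, hasM);
  }

  // Shortcut: inspects only the first coordinate (assumes uniform dimensionality).
  static int typeCodeFirstOnly(int typeId, double[][] coords) {
    if (coords.length == 0) return typeCode(typeId, false, false);
    return typeCode(typeId, !Double.isNaN(coords[0][2]), !Double.isNaN(coords[0][3]));
  }

  public static void main(String[] args) {
    double[][] xyz = {{0, 0, 5, Double.NaN}, {1, 1, 6, Double.NaN}};
    System.out.println(typeCodeFullScan(2, xyz));  // LineString XYZ: prints 1002
    System.out.println(typeCodeFirstOnly(2, xyz)); // prints 1002
  }
}
```

If mixed dimensionality within one geometry is possible, the two methods can disagree, which is the case the question above is probing.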
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -1128,6 +1172,140 @@ public int hashCode() {
}
}
+ public static class GeometryLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+
+ private GeometryLogicalTypeAnnotation(String crs) {
+ this.crs = crs;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOMETRY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
+ sb.append(crs);
+ }
+ sb.append(")");
+ return sb.toString();
+ }
+
+ public String getCrs() {
+ return crs;
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (!(obj instanceof GeometryLogicalTypeAnnotation)) {
+ return false;
+ }
+ GeometryLogicalTypeAnnotation other = (GeometryLogicalTypeAnnotation)
obj;
+ return Objects.equals(crs, other.crs);
+ }
+
+ @Override
+ public int hashCode() {
+ return Objects.hash(crs);
+ }
+
+ @Override
+ PrimitiveStringifier valueStringifier(PrimitiveType primitiveType) {
+ return PrimitiveStringifier.WKB_STRINGIFIER;
+ }
+ }
+
+ public static class GeographyLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+ private final EdgeInterpolationAlgorithm edgeAlgorithm;
+
+ private GeographyLogicalTypeAnnotation(String crs,
EdgeInterpolationAlgorithm edgeAlgorithm) {
+ this.crs = crs;
+ this.edgeAlgorithm = edgeAlgorithm;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOGRAPHY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
+ sb.append(crs);
+ }
+ if (edgeAlgorithm != null) {
+ sb.append(",");
+ sb.append(edgeAlgorithm);
+ }
+ sb.append(")");
+ return sb.toString();
+ }
+
+ public String getCrs() {
+ return crs;
+ }
+
+ public EdgeInterpolationAlgorithm getEdgeAlgorithm() {
+ return edgeAlgorithm;
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (!(obj instanceof GeographyLogicalTypeAnnotation)) {
+ return false;
+ }
+ GeographyLogicalTypeAnnotation other = (GeographyLogicalTypeAnnotation)
obj;
+ return crs != null
Review Comment:
Can we use `Objects.equals(crs, other.crs) && Objects.equals(edgeAlgorithm,
other.edgeAlgorithm)`?
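
A minimal sketch of the null-safe form, for illustration only: `GeographyKeySketch` is a hypothetical stand-in for `GeographyLogicalTypeAnnotation`, and a plain `String` replaces the `EdgeInterpolationAlgorithm` enum so the example compiles on its own.

```java
import java.util.Objects;

final class GeographyKeySketch {
  private final String crs;           // may be null
  private final String edgeAlgorithm; // may be null; a String stands in for the enum

  GeographyKeySketch(String crs, String edgeAlgorithm) {
    this.crs = crs;
    this.edgeAlgorithm = edgeAlgorithm;
  }

  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof GeographyKeySketch)) {
      return false;
    }
    GeographyKeySketch other = (GeographyKeySketch) obj;
    // Objects.equals treats two nulls as equal and never throws on a null operand.
    return Objects.equals(crs, other.crs) && Objects.equals(edgeAlgorithm, other.edgeAlgorithm);
  }

  @Override
  public int hashCode() {
    // Hash the same fields that equals compares, so equal objects share a hash.
    return Objects.hash(crs, edgeAlgorithm);
  }
}
```

This keeps `equals` and `hashCode` consistent without any explicit null branches.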
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -1128,6 +1172,140 @@ public int hashCode() {
}
}
+ public static class GeometryLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+
+ private GeometryLogicalTypeAnnotation(String crs) {
+ this.crs = crs;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOMETRY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
+ sb.append(crs);
+ }
+ sb.append(")");
+ return sb.toString();
+ }
+
+ public String getCrs() {
+ return crs;
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (!(obj instanceof GeometryLogicalTypeAnnotation)) {
+ return false;
+ }
+ GeometryLogicalTypeAnnotation other = (GeometryLogicalTypeAnnotation)
obj;
+ return Objects.equals(crs, other.crs);
+ }
+
+ @Override
+ public int hashCode() {
+ return Objects.hash(crs);
+ }
+
+ @Override
+ PrimitiveStringifier valueStringifier(PrimitiveType primitiveType) {
+ return PrimitiveStringifier.WKB_STRINGIFIER;
+ }
+ }
+
+ public static class GeographyLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+ private final EdgeInterpolationAlgorithm edgeAlgorithm;
+
+ private GeographyLogicalTypeAnnotation(String crs,
EdgeInterpolationAlgorithm edgeAlgorithm) {
+ this.crs = crs;
+ this.edgeAlgorithm = edgeAlgorithm;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOGRAPHY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
+ sb.append(crs);
+ }
+ if (edgeAlgorithm != null) {
+ sb.append(",");
Review Comment:
We need to check the state of `crs` to decide whether a comma is required.
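
One way to avoid tracking the separator by hand is `java.util.StringJoiner`, sketched below. `TypeParamsSketch.typeParameters` is a hypothetical stand-in for `typeParametersAsString()`, the edge algorithm is passed as a `String` instead of the enum, and returning an empty string when nothing is set is an assumption about the desired output, not something the PR specifies.

```java
import java.util.StringJoiner;

final class TypeParamsSketch {
  // Builds "(a,b)", "(a)" or "" depending on which parameters are present.
  static String typeParameters(String crs, String edgeAlgorithm) {
    StringJoiner joiner = new StringJoiner(",", "(", ")");
    joiner.setEmptyValue(""); // no parameters: emit nothing, not "()"
    if (crs != null && !crs.isEmpty()) {
      joiner.add(crs);
    }
    if (edgeAlgorithm != null) {
      joiner.add(edgeAlgorithm);
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    System.out.println(typeParameters("OGC:CRS84", "SPHERICAL")); // prints (OGC:CRS84,SPHERICAL)
    System.out.println(typeParameters("OGC:CRS84", null));        // prints (OGC:CRS84)
  }
}
```

Note that with only the algorithm present this yields `(SPHERICAL)`, which a positional parser would read as a CRS; whether a placeholder is needed in that case is a separate decision.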
##########
parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java:
##########
@@ -524,6 +530,31 @@ public Optional<LogicalType>
visit(LogicalTypeAnnotation.UnknownLogicalTypeAnnot
public Optional<LogicalType>
visit(LogicalTypeAnnotation.IntervalLogicalTypeAnnotation intervalLogicalType) {
return of(LogicalType.UNKNOWN(new NullType()));
}
+
+ @Override
+ public Optional<LogicalType>
visit(LogicalTypeAnnotation.GeometryLogicalTypeAnnotation geometryLogicalType) {
+ GeometryType geometryType = new GeometryType();
+ if (geometryLogicalType.getCrs() != null) {
+ geometryType.setCrs(geometryLogicalType.getCrs());
+ }
+ return of(LogicalType.GEOMETRY(geometryType));
+ }
+
+ @Override
+ public Optional<LogicalType>
visit(LogicalTypeAnnotation.GeographyLogicalTypeAnnotation
geographyLogicalType) {
+ GeographyType geographyType = new GeographyType();
+ if (geographyLogicalType.getCrs() != null) {
+ geographyType.setCrs(geographyLogicalType.getCrs());
+ }
+ if (geographyLogicalType.getEdgeAlgorithm() != null) {
+ EdgeInterpolationAlgorithm algorithm =
+ EdgeInterpolationAlgorithm.valueOf(String.valueOf(geographyLogicalType.getEdgeAlgorithm()));
Review Comment:
Why not use `geographyLogicalType.getEdgeAlgorithm()` directly? These
conversions seem unnecessary.
##########
parquet-column/src/main/java/org/apache/parquet/column/statistics/geometry/BoundingBox.java:
##########
@@ -0,0 +1,257 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.statistics.geometry;
+
+import org.apache.parquet.Preconditions;
+import org.locationtech.jts.geom.Coordinate;
+import org.locationtech.jts.geom.Envelope;
+import org.locationtech.jts.geom.Geometry;
+
+public class BoundingBox {
+
+ private double xMin = Double.NaN;
+ private double xMax = Double.NaN;
+ private double yMin = Double.NaN;
+ private double yMax = Double.NaN;
+ private double zMin = Double.NaN;
+ private double zMax = Double.NaN;
+ private double mMin = Double.NaN;
+ private double mMax = Double.NaN;
Review Comment:
> with the initial/empty state as Inf/-Inf, which simplified the update methods)

I am inclined to move to `Inf/-Inf` to simplify the implementation. We would
need some helper methods like `isValid()` and `hasX()` to make users happy, though.
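
A rough sketch of what the `Inf/-Inf` representation could look like for a single dimension. `XBoundsSketch` is a hypothetical class, not the PR's `BoundingBox`; it covers only X, skips `NaN` inputs by assumption, and ignores any wraparound handling.

```java
final class XBoundsSketch {
  // Empty state: min above every value, max below every value.
  private double xMin = Double.POSITIVE_INFINITY;
  private double xMax = Double.NEGATIVE_INFINITY;

  void update(double x) {
    if (Double.isNaN(x)) {
      return; // Math.min/max would propagate NaN, so skip it explicitly
    }
    xMin = Math.min(xMin, x);
    xMax = Math.max(xMax, x);
  }

  void merge(XBoundsSketch other) {
    // Merging an empty box is a no-op because of the Inf/-Inf sentinels.
    xMin = Math.min(xMin, other.xMin);
    xMax = Math.max(xMax, other.xMax);
  }

  // True once at least one non-NaN value has been seen.
  boolean hasX() {
    return xMin <= xMax;
  }

  double getXMin() {
    return xMin;
  }

  double getXMax() {
    return xMax;
  }
}
```

With this layout neither `update` nor `merge` needs a special case for the empty state, and `hasX()` lets callers detect it.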
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -147,6 +151,30 @@ protected LogicalTypeAnnotation fromString(List<String>
params) {
return float16Type();
}
},
+ GEOMETRY {
+ @Override
+ protected LogicalTypeAnnotation fromString(List<String> params) {
+ if (params.size() < 1) {
+ throw new RuntimeException(
+ "Expecting at least 1 parameter for geometry logical type, got "
+ params.size());
+ }
+ String crs = params.size() > 0 ? params.get(0) : null;
Review Comment:
```suggestion
String crs = params.size() > 0 ? params.get(0) : DEFAULT_CRS;
```
Should it be the default rather than null?
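
As an illustration of the fallback, assuming the size check is dropped and an empty parameter list means "use the default": `CrsParamSketch` and `crsFrom` are hypothetical names, and `DEFAULT_CRS` here mirrors the `"OGC:CRS84"` constant from the PR.

```java
import java.util.List;

final class CrsParamSketch {
  static final String DEFAULT_CRS = "OGC:CRS84";

  // Returns the first parameter as the CRS, or the default when none is given.
  static String crsFrom(List<String> params) {
    return params.isEmpty() ? DEFAULT_CRS : params.get(0);
  }
}
```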
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -41,8 +41,12 @@
import java.util.Set;
import java.util.function.Supplier;
import org.apache.parquet.Preconditions;
+import
org.apache.parquet.column.statistics.geometry.EdgeInterpolationAlgorithm;
public abstract class LogicalTypeAnnotation {
+
+ public static final String DEFAULT_GEOMETRY_CRS = "OGC:CRS84";
Review Comment:
```suggestion
public static final String DEFAULT_CRS = "OGC:CRS84";
```
Now that we also have a geography type, we need to avoid confusion.
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -1128,6 +1172,140 @@ public int hashCode() {
}
}
+ public static class GeometryLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+
+ private GeometryLogicalTypeAnnotation(String crs) {
+ this.crs = crs;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOMETRY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
+ sb.append(crs);
+ }
+ sb.append(")");
+ return sb.toString();
+ }
+
+ public String getCrs() {
+ return crs;
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (!(obj instanceof GeometryLogicalTypeAnnotation)) {
+ return false;
+ }
+ GeometryLogicalTypeAnnotation other = (GeometryLogicalTypeAnnotation)
obj;
+ return Objects.equals(crs, other.crs);
+ }
+
+ @Override
+ public int hashCode() {
+ return Objects.hash(crs);
+ }
+
+ @Override
+ PrimitiveStringifier valueStringifier(PrimitiveType primitiveType) {
+ return PrimitiveStringifier.WKB_STRINGIFIER;
+ }
+ }
+
+ public static class GeographyLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+ private final EdgeInterpolationAlgorithm edgeAlgorithm;
+
+ private GeographyLogicalTypeAnnotation(String crs,
EdgeInterpolationAlgorithm edgeAlgorithm) {
+ this.crs = crs;
+ this.edgeAlgorithm = edgeAlgorithm;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOGRAPHY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
Review Comment:
```suggestion
```
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -147,6 +151,30 @@ protected LogicalTypeAnnotation fromString(List<String>
params) {
return float16Type();
}
},
+ GEOMETRY {
+ @Override
+ protected LogicalTypeAnnotation fromString(List<String> params) {
+ if (params.size() < 1) {
Review Comment:
Do we actually need this check? We could accept a null `crs` instead.
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -1128,6 +1172,140 @@ public int hashCode() {
}
}
+ public static class GeometryLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+
+ private GeometryLogicalTypeAnnotation(String crs) {
+ this.crs = crs;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOMETRY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
+ sb.append(crs);
+ }
+ sb.append(")");
Review Comment:
The parentheses might not be needed if `crs` is not provided.
##########
parquet-column/src/main/java/org/apache/parquet/column/statistics/geometry/GeospatialTypes.java:
##########
@@ -0,0 +1,192 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.statistics.geometry;
+
+import java.util.HashSet;
+import java.util.Set;
+import java.util.stream.Collectors;
+import org.apache.parquet.Preconditions;
+import org.locationtech.jts.geom.Coordinate;
+import org.locationtech.jts.geom.Geometry;
+
+public class GeospatialTypes {
+
+ private static final int UNKNOWN_TYPE_ID = -1;
+ private Set<Integer> types = new HashSet<>();
+ private boolean valid = true;
+
+ public GeospatialTypes(Set<Integer> types) {
+ this.types = types;
+ }
+
+ public GeospatialTypes() {}
+
+ public Set<Integer> getTypes() {
+ return types;
+ }
+
+ void update(Geometry geometry) {
+ if (!valid) {
+ return;
+ }
+ int code = getGeometryTypeCode(geometry);
+ if (code != UNKNOWN_TYPE_ID) {
+ types.add(code);
+ } else {
+ valid = false;
+ types.clear();
+ }
+ }
+
+ public void merge(GeospatialTypes other) {
+ Preconditions.checkArgument(other != null, "Cannot merge with null GeospatialTypes");
+ if (!valid) {
+ return;
+ }
+ if (!other.valid) {
+ valid = false;
+ types.clear();
+ return;
+ }
+ types.addAll(other.types);
+ }
+
+ public void reset() {
+ types.clear();
+ valid = true;
+ }
+
+ public void abort() {
+ valid = false;
+ types.clear();
+ }
+
+ public GeospatialTypes copy() {
+ return new GeospatialTypes(new HashSet<>(types));
+ }
+
+ @Override
+ public String toString() {
+ return "GeospatialTypes{" + "types="
+ + types.stream().map(this::typeIdToString).collect(Collectors.toSet())
+ '}';
+ }
+
+ private int getGeometryTypeId(Geometry geometry) {
+ switch (geometry.getGeometryType()) {
+ case Geometry.TYPENAME_POINT:
+ return 1;
+ case Geometry.TYPENAME_LINESTRING:
+ return 2;
+ case Geometry.TYPENAME_POLYGON:
+ return 3;
+ case Geometry.TYPENAME_MULTIPOINT:
+ return 4;
+ case Geometry.TYPENAME_MULTILINESTRING:
+ return 5;
+ case Geometry.TYPENAME_MULTIPOLYGON:
+ return 6;
+ case Geometry.TYPENAME_GEOMETRYCOLLECTION:
+ return 7;
+ default:
+ return UNKNOWN_TYPE_ID;
+ }
+ }
+
+ /**
+ * Geospatial type codes:
+ *
+ * | Type | XY | XYZ | XYM | XYZM |
+ * | :----------------- | :--- | :--- | :--- | :--: |
+ * | Point | 0001 | 1001 | 2001 | 3001 |
+ * | LineString | 0002 | 1002 | 2002 | 3002 |
+ * | Polygon | 0003 | 1003 | 2003 | 3003 |
+ * | MultiPoint | 0004 | 1004 | 2004 | 3004 |
+ * | MultiLineString | 0005 | 1005 | 2005 | 3005 |
+ * | MultiPolygon | 0006 | 1006 | 2006 | 3006 |
+ * | GeometryCollection | 0007 | 1007 | 2007 | 3007 |
+ *
+ * See https://github.com/apache/parquet-format/blob/master/Geospatial.md#geospatial-types
+ */
+ private int getGeometryTypeCode(Geometry geometry) {
+ int typeId = getGeometryTypeId(geometry);
+ if (typeId == UNKNOWN_TYPE_ID) {
+ return UNKNOWN_TYPE_ID;
+ }
+ Coordinate[] coordinates = geometry.getCoordinates();
+ boolean hasZ = false;
+ boolean hasM = false;
+ for (Coordinate coordinate : coordinates) {
+ if (!Double.isNaN(coordinate.getZ())) {
Review Comment:
Do we need to check +/-Inf here?
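
To make the difference concrete, the two candidate predicates are sketched below. `OrdinateCheckSketch` and its method names are hypothetical; which behaviour is right for infinite ordinates is the open question here, so neither is presented as the intended one.

```java
final class OrdinateCheckSketch {
  // Current behaviour in the hunk: anything that is not NaN counts as present,
  // including +Infinity and -Infinity.
  static boolean presentIfNotNaN(double value) {
    return !Double.isNaN(value);
  }

  // Stricter alternative: NaN and both infinities count as absent.
  static boolean presentIfFinite(double value) {
    return Double.isFinite(value);
  }
}
```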
##########
parquet-column/src/main/java/org/apache/parquet/column/statistics/geometry/GeospatialStatistics.java:
##########
@@ -0,0 +1,285 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.statistics.geometry;
+
+import org.apache.parquet.Preconditions;
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.LogicalTypeAnnotation;
+import org.apache.parquet.schema.PrimitiveType;
+import org.locationtech.jts.geom.Geometry;
+import org.locationtech.jts.io.ParseException;
+import org.locationtech.jts.io.WKBReader;
+
+/**
+ * A structure for capturing metadata for estimating the unencoded,
+ * uncompressed size of geospatial data written.
+ */
+public class GeospatialStatistics {
+
+ public static final String DEFAULT_GEOSPATIAL_STAT_CRS = "OGC:CRS84";
+
+ // Metadata that may impact the statistics calculation
+ private final String crs;
+ private final BoundingBox boundingBox;
+ private final EdgeInterpolationAlgorithm edgeAlgorithm;
+ private final GeospatialTypes geospatialTypes;
+
+ /**
+ * Whether the statistics has valid value.
+ *
+ * It is true by default. Only set to false while it fails to merge statistics.
+ */
+ private boolean valid = true;
+
+ /**
+ * Merge the statistics from another GeospatialStatistics object.
+ *
+ * @param other the other GeospatialStatistics object
+ */
+ public void mergeStatistics(GeospatialStatistics other) {
+ if (!valid) return;
Review Comment:
Unfortunately, `valid` is not well maintained in this method.
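
One possible policy, sketched under the assumption that a merged result should be valid only when both inputs are valid (the rule `GeospatialTypes.merge` already follows). `MergeValiditySketch` is a hypothetical simplification that tracks a set of type codes only, and treating a null argument as invalidating is an assumption made for the sketch.

```java
import java.util.HashSet;
import java.util.Set;

final class MergeValiditySketch {
  private final Set<Integer> types = new HashSet<>();
  private boolean valid = true;

  void add(int code) {
    if (valid) {
      types.add(code);
    }
  }

  // Marks the statistics as unusable and drops any collected state.
  void abort() {
    valid = false;
    types.clear();
  }

  void merge(MergeValiditySketch other) {
    if (!valid) {
      return; // already invalid, nothing to update
    }
    if (other == null || !other.valid) {
      abort(); // an unknown or invalid input invalidates the merged result
      return;
    }
    types.addAll(other.types);
  }

  boolean isValid() {
    return valid;
  }

  Set<Integer> getTypes() {
    return types;
  }
}
```

The point of the sketch is that every exit path of `merge` leaves `valid` in a state that reflects both inputs.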
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -1128,6 +1172,140 @@ public int hashCode() {
}
}
+ public static class GeometryLogicalTypeAnnotation extends
LogicalTypeAnnotation {
+ private final String crs;
+
+ private GeometryLogicalTypeAnnotation(String crs) {
+ this.crs = crs;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T>
logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOMETRY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
Review Comment:
```suggestion
```
##########
parquet-column/src/main/java/org/apache/parquet/column/statistics/geometry/GeospatialStatistics.java:
##########
@@ -0,0 +1,285 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.statistics.geometry;
+
+import org.apache.parquet.Preconditions;
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.LogicalTypeAnnotation;
+import org.apache.parquet.schema.PrimitiveType;
+import org.locationtech.jts.geom.Geometry;
+import org.locationtech.jts.io.ParseException;
+import org.locationtech.jts.io.WKBReader;
+
+/**
+ * A structure for capturing metadata for estimating the unencoded,
+ * uncompressed size of geospatial data written.
+ */
+public class GeospatialStatistics {
+
+ public static final String DEFAULT_GEOSPATIAL_STAT_CRS = "OGC:CRS84";
+
+ // Metadata that may impact the statistics calculation
+ private final String crs;
+ private final BoundingBox boundingBox;
+ private final EdgeInterpolationAlgorithm edgeAlgorithm;
+ private final GeospatialTypes geospatialTypes;
+
+ /**
+ * Whether the statistics has valid value.
+ *
+ * It is true by default. Only set to false while it fails to merge statistics.
+ */
+ private boolean valid = true;
+
+ /**
+ * Merge the statistics from another GeospatialStatistics object.
+ *
+ * @param other the other GeospatialStatistics object
+ */
+ public void mergeStatistics(GeospatialStatistics other) {
+ if (!valid) return;
+
+ if (other == null) {
+ return;
+ }
+ if (this.boundingBox != null && other.boundingBox != null) {
+ this.boundingBox.merge(other.boundingBox);
+ }
+ if (this.geospatialTypes != null && other.geospatialTypes != null) {
+ this.geospatialTypes.merge(other.geospatialTypes);
+ }
+ }
+
+ /**
+ * Builder to create a GeospatialStatistics.
+ */
+ public static class Builder {
+ private final String crs;
Review Comment:
`crs` no longer seems to be used here. Should we remove it? The same applies to
the `crs` field above and to the default value (`DEFAULT_GEOSPATIAL_STAT_CRS`).
##########
parquet-column/src/main/java/org/apache/parquet/schema/LogicalTypeAnnotation.java:
##########
@@ -1128,6 +1172,140 @@ public int hashCode() {
}
}
+ public static class GeometryLogicalTypeAnnotation extends LogicalTypeAnnotation {
+ private final String crs;
+
+ private GeometryLogicalTypeAnnotation(String crs) {
+ this.crs = crs;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T> logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOMETRY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
+ sb.append(crs);
+ }
+ sb.append(")");
+ return sb.toString();
+ }
+
+ public String getCrs() {
+ return crs;
+ }
+
+ @Override
+ public boolean equals(Object obj) {
+ if (!(obj instanceof GeometryLogicalTypeAnnotation)) {
+ return false;
+ }
+ GeometryLogicalTypeAnnotation other = (GeometryLogicalTypeAnnotation) obj;
+ return Objects.equals(crs, other.crs);
+ }
+
+ @Override
+ public int hashCode() {
+ return Objects.hash(crs);
+ }
+
+ @Override
+ PrimitiveStringifier valueStringifier(PrimitiveType primitiveType) {
+ return PrimitiveStringifier.WKB_STRINGIFIER;
+ }
+ }
+
+ public static class GeographyLogicalTypeAnnotation extends LogicalTypeAnnotation {
+ private final String crs;
+ private final EdgeInterpolationAlgorithm edgeAlgorithm;
+
+ private GeographyLogicalTypeAnnotation(String crs, EdgeInterpolationAlgorithm edgeAlgorithm) {
+ this.crs = crs;
+ this.edgeAlgorithm = edgeAlgorithm;
+ }
+
+ @Override
+ @Deprecated
+ public OriginalType toOriginalType() {
+ return null;
+ }
+
+ @Override
+ public <T> Optional<T> accept(LogicalTypeAnnotationVisitor<T> logicalTypeAnnotationVisitor) {
+ return logicalTypeAnnotationVisitor.visit(this);
+ }
+
+ @Override
+ LogicalTypeToken getType() {
+ return LogicalTypeToken.GEOGRAPHY;
+ }
+
+ @Override
+ protected String typeParametersAsString() {
+ StringBuilder sb = new StringBuilder();
+ sb.append("(");
+ sb.append(",");
+ if (crs != null && !crs.isEmpty()) {
+ sb.append(",");
Review Comment:
```suggestion
```
##########
parquet-column/src/main/java/org/apache/parquet/column/statistics/geometry/GeospatialStatistics.java:
##########
@@ -0,0 +1,285 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.parquet.column.statistics.geometry;
+
+import org.apache.parquet.Preconditions;
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.LogicalTypeAnnotation;
+import org.apache.parquet.schema.PrimitiveType;
+import org.locationtech.jts.geom.Geometry;
+import org.locationtech.jts.io.ParseException;
+import org.locationtech.jts.io.WKBReader;
+
+/**
+ * A structure for capturing metadata for estimating the unencoded,
+ * uncompressed size of geospatial data written.
+ */
+public class GeospatialStatistics {
+
+ public static final String DEFAULT_GEOSPATIAL_STAT_CRS = "OGC:CRS84";
+
+ // Metadata that may impact the statistics calculation
+ private final String crs;
+ private final BoundingBox boundingBox;
+ private final EdgeInterpolationAlgorithm edgeAlgorithm;
+ private final GeospatialTypes geospatialTypes;
+
+ /**
+ * Whether the statistics has valid value.
+ *
+ * It is true by default. Only set to false while it fails to merge statistics.
+ */
+ private boolean valid = true;
+
+ /**
+ * Merge the statistics from another GeospatialStatistics object.
+ *
+ * @param other the other GeospatialStatistics object
+ */
+ public void mergeStatistics(GeospatialStatistics other) {
+ if (!valid) return;
+
+ if (other == null) {
+ return;
+ }
+ if (this.boundingBox != null && other.boundingBox != null) {
+ this.boundingBox.merge(other.boundingBox);
+ }
+ if (this.geospatialTypes != null && other.geospatialTypes != null) {
+ this.geospatialTypes.merge(other.geospatialTypes);
+ }
+ }
+
+ /**
+ * Builder to create a GeospatialStatistics.
+ */
+ public static class Builder {
+ private final String crs;
+ private BoundingBox boundingBox;
+ private GeospatialTypes geospatialTypes;
+ private EdgeInterpolationAlgorithm edgeAlgorithm;
+ private final WKBReader reader = new WKBReader();
+
+ /**
+ * Create a builder to create a GeospatialStatistics.
+ * For Geometry type, edgeAlgorithm is not required.
+ *
+ * @param crs the coordinate reference system
+ */
+ public Builder(String crs) {
+ this.crs = crs;
+ this.boundingBox = new BoundingBox();
+ this.geospatialTypes = new GeospatialTypes();
+ this.edgeAlgorithm = null;
+ }
+
+ /**
+ * Create a builder to create a GeospatialStatistics.
+ * For Geography type, optional edgeAlgorithm can be set.
+ *
+ * @param crs the coordinate reference system
+ */
+ public Builder(String crs, EdgeInterpolationAlgorithm edgeAlgorithm) {
+ this.crs = crs;
+ this.boundingBox = new BoundingBox();
+ this.geospatialTypes = new GeospatialTypes();
+ this.edgeAlgorithm = edgeAlgorithm;
+ }
+
+ public void update(Binary value) {
+ if (value == null) {
+ return;
+ }
+ try {
+ Geometry geom = reader.read(value.getBytes());
+ update(geom);
+ } catch (ParseException e) {
+ abort();
+ }
+ }
+
+ private void update(Geometry geom) {
+ boundingBox.update(geom, crs);
+ geospatialTypes.update(geom);
+ }
+
+ public void abort() {
+ boundingBox.abort();
+ geospatialTypes.abort();
+ }
+
+ /**
+ * Build a GeospatialStatistics from the builder.
+ *
+ * @return a new GeospatialStatistics object
+ */
+ public GeospatialStatistics build() {
+ return new GeospatialStatistics(crs, boundingBox, geospatialTypes, edgeAlgorithm);
+ }
+ }
+
+ /**
+ * Create a new GeospatialStatistics builder with the specified CRS.
+ *
+ * @param type the primitive type
+ * @return a new GeospatialStatistics builder
+ */
+ public static GeospatialStatistics.Builder newBuilder(PrimitiveType type) {
+ LogicalTypeAnnotation logicalTypeAnnotation = type.getLogicalTypeAnnotation();
+ if (logicalTypeAnnotation instanceof LogicalTypeAnnotation.GeometryLogicalTypeAnnotation) {
+ String crs = ((LogicalTypeAnnotation.GeometryLogicalTypeAnnotation) logicalTypeAnnotation).getCrs();
+ return new GeospatialStatistics.Builder(crs);
+ } else if (logicalTypeAnnotation instanceof LogicalTypeAnnotation.GeographyLogicalTypeAnnotation) {
+ String crs = ((LogicalTypeAnnotation.GeographyLogicalTypeAnnotation) logicalTypeAnnotation).getCrs();
+ EdgeInterpolationAlgorithm edgeAlgorithm =
+ ((LogicalTypeAnnotation.GeographyLogicalTypeAnnotation) logicalTypeAnnotation).getEdgeAlgorithm();
+ return new GeospatialStatistics.Builder(crs, edgeAlgorithm);
+ } else {
+ return noopBuilder();
+ }
+ }
+
+ /**
+ * Constructs a GeospatialStatistics object with the specified CRS, bounding box, and geospatial types.
+ *
+ * @param crs the coordinate reference system
+ * @param boundingBox the bounding box for the geospatial data
+ * @param geospatialTypes the geospatial types
+ */
+ public GeospatialStatistics(
+ String crs,
+ BoundingBox boundingBox,
+ GeospatialTypes geospatialTypes,
+ EdgeInterpolationAlgorithm edgeAlgorithm) {
+ this.crs = crs;
+ this.boundingBox = boundingBox;
+ this.geospatialTypes = geospatialTypes;
+ this.edgeAlgorithm = edgeAlgorithm;
+ }
+
+ /**
+ * Constructs a GeospatialStatistics object with the specified CRS.
+ *
+ * @param crs the coordinate reference system
+ */
+ public GeospatialStatistics(String crs) {
+ this(crs, new BoundingBox(), new GeospatialTypes(), null);
+ }
+
+ /**
+ * Constructs a GeospatialStatistics object with the specified CRS and edge interpolation algorithm.
+ *
+ * @param crs the coordinate reference system
+ * @param edgeAlgorithm the edge interpolation algorithm
+ */
+ public GeospatialStatistics(String crs, EdgeInterpolationAlgorithm edgeAlgorithm) {
+ this.crs = crs;
+ this.boundingBox = new BoundingBox();
+ this.geospatialTypes = new GeospatialTypes();
+ this.edgeAlgorithm = edgeAlgorithm;
+ }
+
+ /** Returns the coordinate reference system. */
+ public BoundingBox getBoundingBox() {
+ return boundingBox;
+ }
+
+ /** Returns the geometry types. */
+ public GeospatialTypes getGeospatialTypes() {
+ return geospatialTypes;
+ }
+
+ /**
+ * @return whether the statistics has valid value.
+ */
+ public boolean isValid() {
+ return valid;
+ }
+
+ public void merge(GeospatialStatistics other) {
+ if (!valid) return;
Review Comment:
Ditto: `valid` is never updated.
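To illustrate, here is a rough sketch of a merge that actually updates `valid`. It uses a hypothetical stand-in class (`ValiditySketch`, a 1-D range instead of a bounding box), not the PR's types. The rule that merging an invalid `other` invalidates `this` is my assumption; the null handling mirrors `mergeStatistics` above.

```java
// Hypothetical stand-in for GeospatialStatistics: a 1-D range instead of a
// bounding box, only to show where `valid` would get updated.
final class ValiditySketch {
  private boolean valid = true;
  private double min = Double.POSITIVE_INFINITY;
  private double max = Double.NEGATIVE_INFINITY;

  /** Widens the range; a no-op once the statistics are invalid. */
  void update(double value) {
    if (!valid) return;
    min = Math.min(min, value);
    max = Math.max(max, value);
  }

  /** Marks the statistics as unusable. */
  void abort() {
    valid = false;
  }

  void merge(ValiditySketch other) {
    if (!valid) return;
    if (other == null) return; // same null handling as mergeStatistics
    if (!other.valid) {
      // Assumed rule: merging invalid statistics makes the result invalid.
      abort();
      return;
    }
    min = Math.min(min, other.min);
    max = Math.max(max, other.max);
  }

  boolean isValid() {
    return valid;
  }

  double getMin() {
    return min;
  }

  double getMax() {
    return max;
  }
}
```

With something like this, the `if (!valid) return;` guard has an effect, because there is a code path that sets the flag to false.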
##########
parquet-column/src/test/java/org/apache/parquet/column/statistics/geometry/TestGeospatialStatistics.java:
##########
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.parquet.column.statistics.geometry;
+
+import org.apache.parquet.io.api.Binary;
+import org.apache.parquet.schema.LogicalTypeAnnotation;
+import org.apache.parquet.schema.PrimitiveType;
+import org.apache.parquet.schema.Types;
+import org.junit.Assert;
+import org.junit.Test;
+
+public class TestGeospatialStatistics {
+
+ @Test
+ public void testAddGeospatialData() {
+ PrimitiveType type = Types.optional(PrimitiveType.PrimitiveTypeName.BINARY)
+ .as(LogicalTypeAnnotation.geometryType())
+ .named("a");
+ GeospatialStatistics.Builder builder = GeospatialStatistics.newBuilder(type);
+ builder.update(Binary.fromString("POINT (1 1)"));
+ builder.update(Binary.fromString("POINT (2 2)"));
+ GeospatialStatistics statistics = builder.build();
+ Assert.assertTrue(statistics.isValid());
+ Assert.assertNotNull(statistics.getBoundingBox());
+ Assert.assertNotNull(statistics.getGeospatialTypes());
+ }
+
+ @Test
+ public void testMergeGeospatialStatistics() {
Review Comment:
We need more test cases covering statistics that become invalid after a merge.
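The scenarios I have in mind look roughly like the sketch below. `FakeStats` is a hypothetical stand-in so the snippet runs on its own, and its invalidation rule is an assumption. The real tests would go through `GeospatialStatistics` and its `Builder`, with JUnit assertions.

```java
// Hypothetical stand-in; real tests would use GeospatialStatistics instead.
final class MergeScenarios {
  static final class FakeStats {
    boolean valid = true;

    void abort() {
      valid = false;
    }

    void merge(FakeStats other) {
      if (!valid) return;
      if (other != null && !other.valid) abort(); // assumed invalidation rule
    }
  }

  static FakeStats invalidStats() {
    FakeStats s = new FakeStats();
    s.abort();
    return s;
  }

  /** Scenario 1: valid statistics merged with invalid ones. */
  static boolean validAfterMergingInvalid() {
    FakeStats stats = new FakeStats();
    stats.merge(invalidStats());
    return stats.valid;
  }

  /** Scenario 2: invalid statistics merged with valid ones. */
  static boolean validAfterMergingIntoInvalid() {
    FakeStats stats = invalidStats();
    stats.merge(new FakeStats());
    return stats.valid;
  }

  /** Scenario 3: merging null leaves the flag untouched. */
  static boolean validAfterMergingNull() {
    FakeStats stats = new FakeStats();
    stats.merge(null);
    return stats.valid;
  }
}
```

Under the assumed rule, scenarios 1 and 2 should end up invalid and scenario 3 should stay valid.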
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]