Yeah, this is a bug. You should be able to define multiple partition functions on the same field. But we do want to check that multiple time partitions are not used because they are redundant. I'll open a PR. Thanks for pointing this out!
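
For illustration only, here is a minimal sketch of the behavior described above. It is not the actual Iceberg code or the planned PR. The `Field` class, `fieldsBySourceId`, and `checkNoRedundantTimeTransforms` are hypothetical stand-ins that use only the JDK. Fields are grouped by source column id into lists, so several functions on one column no longer collide, and more than one time-based function on the same column is rejected as redundant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PartitionFieldCheck {

  // Hypothetical stand-in for a partition field: source column id plus transform name.
  public static final class Field {
    final int sourceId;
    final String transform;

    public Field(int sourceId, String transform) {
      this.sourceId = sourceId;
      this.transform = transform;
    }
  }

  // Assumption: these four names are the time-based partition functions.
  private static final Set<String> TIME_TRANSFORMS =
      new HashSet<>(Arrays.asList("year", "month", "day", "hour"));

  // Group fields by source id into lists, so two fields on one column do not conflict.
  public static Map<Integer, List<Field>> fieldsBySourceId(List<Field> fields) {
    Map<Integer, List<Field>> bySource = new LinkedHashMap<>();
    for (Field field : fields) {
      bySource.computeIfAbsent(field.sourceId, id -> new ArrayList<>()).add(field);
    }
    return bySource;
  }

  // Reject more than one time-based transform on the same source column.
  public static void checkNoRedundantTimeTransforms(List<Field> fields) {
    for (Map.Entry<Integer, List<Field>> entry : fieldsBySourceId(fields).entrySet()) {
      int timeCount = 0;
      for (Field field : entry.getValue()) {
        if (TIME_TRANSFORMS.contains(field.transform)) {
          timeCount++;
        }
      }
      if (timeCount > 1) {
        throw new IllegalArgumentException(
            "Redundant time partitions on source id " + entry.getKey());
      }
    }
  }

  public static void main(String[] args) {
    // Two different kinds of function on column 1: accepted.
    checkNoRedundantTimeTransforms(
        Arrays.asList(new Field(1, "hour"), new Field(1, "bucket")));

    // Two time-based functions on column 1: rejected.
    try {
      checkNoRedundantTimeTransforms(
          Arrays.asList(new Field(1, "year"), new Field(1, "hour")));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The grouping step is the part that relates to the stack trace quoted below: a map builder that allows only one entry per key fails as soon as two fields share a source id, while a map of lists does not.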

On Tue, May 28, 2019 at 4:15 AM Anton Okolnychyi <aokolnyc...@apple.com.invalid> wrote:

> Hm, this is actually a good question.
>
> My understanding is that we shouldn't explicitly define partitioning by
> year/month/day/hour on the same column. Instead, we should be fine with
> hour only. Iceberg produces ordinals for time-based partition functions. As
> far as I remember, Ryan was planning to submit a PR in order to prohibit
> multiple partition functions.
>
> I believe in the above case you are trying to create one partition spec
> with multiple partition functions on the same field.
>
> Keep in mind that if you partition by hour only, the directory structure
> won’t contain year/month/day folders. If you want that directory
> structure, you need to have actual columns for year/month/day in your
> dataset and use the identity partition function.
>
> Thanks,
> Anton
>
> > On 28 May 2019, at 09:27, filip <filip....@gmail.com> wrote:
> >
> > A while back I bumped into an issue with what seems to be an
> > inconsistency in the partition spec API or maybe it's just an
> > implementation bug.
> >
> > Attempting to have multiple partition specs on the same schema field, I
> > bumped into an issue: while the API allows multiple partition specs to
> > be defined for the same field, internally this conflicts with the
> > assumption that there is only one partition spec per field.
> >
> > Given this partition spec:
> >
> > PartitionSpec spec = PartitionSpec.builderFor(schema)
> >     .withSpecId(0)
> >     .year("timestamp")
> >     .month("timestamp")
> >     .day("timestamp")
> >     .hour("timestamp")
> >     .build();
> >
> > Trying to validate partition pruning with similar code to:
> >
> > UnboundPredicate<Object> match = Expressions.equal("timestamp",
> >     Literal.of("2019-01-11T00:00:00.000000").to(TimestampType.withoutZone()).value());
> > Assert.assertTrue(
> >     new InclusiveManifestEvaluator(spec, match).eval(table.currentSnapshot().manifests().get(0)));
> >
> > I get an unexpected Google Collections exception:
> >
> > java.lang.IllegalArgumentException: Multiple entries with same key: 1=org.apache.iceberg.PartitionField@da8cdda7 and 1=org.apache.iceberg.PartitionField@e5c6fddb
> >
> >   at com.google.common.collect.ImmutableMap.conflictException(ImmutableMap.java:215)
> >   at com.google.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:209)
> >   at com.google.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:147)
> >   at com.google.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:110)
> >   at com.google.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:393)
> >   at org.apache.iceberg.PartitionSpec.lazyFieldsBySourceId(PartitionSpec.java:232)
> >   at org.apache.iceberg.PartitionSpec.getFieldBySourceId(PartitionSpec.java:95)
> >   at org.apache.iceberg.expressions.Projections$InclusiveProjection.predicate(Projections.java:208)
> >   at org.apache.iceberg.expressions.Projections$InclusiveProjection.predicate(Projections.java:200)
> >   at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.predicate(Projections.java:185)
> >   at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.predicate(Projections.java:136)
> >   at org.apache.iceberg.expressions.ExpressionVisitors.visit(ExpressionVisitors.java:152)
> >   at org.apache.iceberg.expressions.Projections$BaseProjectionEvaluator.project(Projections.java:152)
> >   at org.apache.iceberg.expressions.InclusiveManifestEvaluator.<init>(InclusiveManifestEvaluator.java:63)
> >   at org.apache.iceberg.expressions.InclusiveManifestEvaluator.<init>(InclusiveManifestEvaluator.java:56)
> >   at org.apache.iceberg.TestScansAndSchemaEvolution.testMultiPartitionPerFieldTransform(TestScansAndSchemaEvolution.java:177)
> >
> > I was wondering if this issue is tracked so maybe I could help out.
> >
> > Thanks,
> > /Filip

--
Ryan Blue
Software Engineer
Netflix