Hi,

I was wondering if someone could give me some pointers on this line: https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/avro/AvroIO.java#L53

Some threads keep getting stuck trying to read the manifest list on commit. From the debug logs, it looks like we're reading the Avro file multiple times.

From a stack trace, it looks like we're using the AvroInputStreamAdapter. Our table's metadata IO should be a HadoopFileIO (with s3a). Our classpath has all the specified classes, but org.apache.avro.file.SeekableInput hasn't been relocated (its package name is unchanged).

Should it be relocated? Should we be trying to use the AvroFSInput? Is the boolean just named wrong?
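
For context, a minimal sketch of how a relocation check of this kind can be written. It is not the actual AvroIO code: the class RelocationCheck and the helper isRelocated are hypothetical names, and java.util.ArrayList stands in for SeekableInput so that the example runs without Avro on the classpath. Note that some shading tools also rewrite matching string constants, so the result of such a comparison can depend on how the jar was built.

```java
public class RelocationCheck {
    // Hypothetical helper: returns true when the class's runtime name differs
    // from the name it had before shading, i.e. the class was relocated.
    static boolean isRelocated(Class<?> cls, String originalName) {
        return !originalName.equals(cls.getName());
    }

    public static void main(String[] args) {
        // Stand-in for SeekableInput: a JDK class is never relocated,
        // so this prints "false".
        System.out.println(isRelocated(java.util.ArrayList.class, "java.util.ArrayList"));
    }
}
```

With a flag defined this way, "relocated == true" means the class name was changed by shading, which is the reading the questions above assume.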

(stack trace):
"dse-ingester-system-akka.actor.default-dispatcher-5" #39 prio=5 os_prio=0 tid=0x00007fbbec00b000 nid=0x40 runnable [0x00007fbc4c52b000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.iceberg.avro.AvroIO$AvroInputStreamAdapter.read(AvroIO.java:120)
    at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:61)
    at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:100)
    at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:77)
    at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:37)
    at org.apache.iceberg.relocated.com.google.common.collect.Iterables.addAll(Iterables.java:320)
    at org.apache.iceberg.relocated.com.google.common.collect.Lists.newLinkedList(Lists.java:237)
    at org.apache.iceberg.ManifestLists.read(ManifestLists.java:46)
    at org.apache.iceberg.BaseSnapshot.cacheManifests(BaseSnapshot.java:127)
    at org.apache.iceberg.BaseSnapshot.allManifests(BaseSnapshot.java:141)
    at org.apache.iceberg.FastAppend.apply(FastAppend.java:142)
    at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:149)
    at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:262)
    at org.apache.iceberg.SnapshotProducer$$Lambda$1641/2145060236.run(Unknown Source)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:213)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:197)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:189)
    at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:261)
    at com.box.dataplatform.iceberg.client.DPIcebergClientImpl.commitBatch(DPIcebergClientImpl.java:85)
...
   Locked ownable synchronizers:
    - None


Thanks,

John
