Hi all,

Each Iceberg implementation has its own tests, but there isn't a shared way
to check that a table written by one is read the same way by another.
A few examples that have come up across the implementations: a manifest
written by one client that another can't read, a metadata.json that one
writer produces and another rejects because they disagree on whether a
field is required, and a partition transform that ends up encoded more than
one way across implementations. Some of these turned out to be bugs; others
turned out to be places where the spec is ambiguous.

We think this is worth solving with some form of shared
cross-implementation conformance testing, and we'd like to align as a
community on whether to take it on and how best to start. We've written up
our current thinking, a possible direction, and a small prototype in the
doc below.

Details, a repo design, and the interop failures we've collected:
https://docs.google.com/document/d/1HRcUMcrqUjo4CjGdwAIw85f7miWOGJ4ZJ90AgHbahaw/edit?usp=sharing


Feedback welcome on whether this is worth doing and how we might get
started.

Thanks,
Neelesh (with Andrei Tserakhau)
