+1 -- comparing decoded values is a great fit for value-correctness cases (variant, deletes, nested), but I am not sure if it would catch type-level divergence like equality_ids int-vs-long (apache/iceberg-go#880), since the values decode equal and a plain-JSON expected can't even express "int, not long" — so it may be worth scoping v1 around value-correctness first and bringing the type-level cases in with a typed expected or a write-side check (along the lines Tanmay described).
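A minimal Python sketch of that gap, assuming a hypothetical decoded form `(type_name, value)` and a hypothetical typed-expected JSON shape (neither is taken from the proposal doc). It only illustrates that a value-only comparison accepts both encodings of equality_ids, while a typed expected can reject the long one:

```python
import json

# Hypothetical decoded form of a field: (physical type name, decoded value).
# In a real reader the type would come from the file schema, not from a tuple.
written_as_int = ("int", [1, 2])    # equality_ids encoded with int elements
written_as_long = ("long", [1, 2])  # same values, encoded with long elements

# Plain-JSON expected: only the values can be expressed.
expected_plain = json.loads('{"equality_ids": [1, 2]}')

# Typed expected (an assumed format): the element type is spelled out.
expected_typed = json.loads(
    '{"equality_ids": {"type": "int", "value": [1, 2]}}'
)


def matches_plain(decoded, expected):
    """Value-only comparison: the physical type is ignored."""
    _type_name, value = decoded
    return value == expected["equality_ids"]


def matches_typed(decoded, expected):
    """Type-aware comparison: type name and values must both match."""
    type_name, value = decoded
    spec = expected["equality_ids"]
    return type_name == spec["type"] and value == spec["value"]
```

With these definitions, `matches_plain` returns True for both encodings, whereas `matches_typed` returns True only for the int one.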
Thanks
Xin

On Tue, Jun 30, 2026 at 9:32 AM Jones, Danny <[email protected]> wrote:

> +1, I’m broadly aligned with this proposal. I think having a reference physical artifact to then compare against is valuable.
>
> My team have been working on a few sets of tests that are of a similar nature. The motivation for us has been correctness of maintenance operations. I’ll share a bit of info here about these, in case it’s relevant and to understand how they may complement this proposal.
>
> The first set tackles the question: is the compaction/replace operation resulting in the same logical data? We generate a table using Spark and then run some compaction operation (using a runner harness API similar to the one proposed in this doc, i.e. “please compact table X”, could be PySpark, could be an API call to some maintenance service). Afterwards, we run a few common query engines (Spark, DuckDB, etc.) and verify that they agree with respect to an order-independent checksum, row count, NDVs; plus, that they return the same presence or absence for sampled rows to exercise the metadata path.
>
> Second, we have a table builder that, using the small model property (formal methods), is building an exhaustive set of table layouts which we can use as input to the above tests. For example, one test case is 3 physical rows across 2 data files; 2 rows are deleted across 2 positional delete files.
>
> Third, we have a validator we’ve written in Rust that just loads a metadata DAG (the JSON, manifest lists, manifests) and validates a bunch of invariants – i.e. data/file sequence numbers in manifest entries are optional ONLY for Added status, sequence numbers are non-negative; for the replace operation: table-uuid is immutable, schema fields don’t change, etc. I think it’s a little imperfect since it relies on iceberg-rust which swaps some things under the hood very helpfully but means we aren’t testing the actual physical artifacts.
>
> Happy to chat more about some of these – the first two I’ve been working with a colleague on getting in a shape to publish on GitHub.
>
> Danny
>
> On 2026/06/29 18:40:43 Tanmay Rauth wrote:
> > > Thanks Neelesh, the doc lays this out really well, and +1 to Matt. The framing I'd most want to underline is one you already make: the hardest cases aren't bugs, they're where two implementations both follow the spec faithfully and still disagree. The day-transform field type (iceberg#16414) is a good example, and I think your point that writing down the expected value is what forces the ambiguity to get resolved is one of the strongest motivations for the proposal. That's something per-implementation CI can never do on its own.
> > >
> > > A couple of small things, for whatever they're worth:
> > >
> > > - The decoded-values-not-bytes approach feels right. The day-transform case (iceberg#16414) is exactly where a byte-level comparison would flag two valid encodings as different, while a value comparison correctly treats them as equivalent.
> > > - On the open question of reads-only vs. reads+writes: iceberg-go#880 actually originated on the write side (Go wrote equality_ids as long, and Java failed when reading it). It might be worth structuring each fixture as input -> golden file -> expected value from the start. The same fixture can then exercise both directions: read tests verify that the golden file decodes to the expected value, while write tests verify that an implementation produces a conforming golden file. That avoids having to re-author fixtures when write conformance is added later.
> > >
> > > Thanks for putting this together.
> > >
> > > Regards,
> > > Tanmay Rauth
> > >
> > > On Mon, Jun 29, 2026 at 9:57 AM Sung Yun <[email protected]> wrote:
> > >
> > > > +1, thanks Neelesh.
> > > > Linking my parallel thread and doc for anyone who wants the detail [1].
> > > >
> > > > Having read your write-up, I think the two are substantially the same proposal, with just narrow differences around proposed repo layout and the integration plan. I think it's a great sign that there's already a great amount of overlap in our thoughts. I agree that a community sync sounds worthwhile, and it would also be useful to converge the two docs in parallel so we bring one proposal back here for review and convergence through lazy consensus.
> > > >
> > > > A few areas from my version/poc [2] I think are worth folding in as points to discuss and converge on:
> > > >
> > > > - Contribution/README guides for adding and reviewing fixtures.
> > > > - A submodule-based integration pattern, with each implementation pinning the fixture repo to a commit.
> > > > - How each test surface is meant to be consumed and integrated by the individual implementations in their CI.
> > > >
> > > > Sung
> > > >
> > > > [1] https://lists.apache.org/thread/964630c6q0jovs579x1jzb1t0o19jgjg
> > > > [2] https://github.com/sungwy/iceberg-testing/pull/1
> > > >
> > > > On 2026/06/29 16:47:18 Neelesh Salian wrote:
> > > > > Thanks Matt. Seems like there is interest in doing this.
> > > > > Separately, Sung has a similar proposal in the community and we are connected offline to sync and converge since the proposals are along similar lines.
> > > > > Will update this thread as we discuss.
> > > > > If there are more folks interested in this, it might be worth doing a community one-off sync to brainstorm this as well.
> > > > >
> > > > > On Mon, Jun 29, 2026 at 8:30 AM Matt Topol <[email protected]> wrote:
> > > > >
> > > > > > Thanks for the proposal! I'm gonna read through this, but I just wanted to chime in that this is something I've been desiring and hoping for for a long time. We've encountered tons of cases during the development of iceberg-go where implementations diverged while still following the letter of the spec. This kind of testing is very much needed.
> > > > > >
> > > > > > --Matt
> > > > > >
> > > > > > On Mon, Jun 29, 2026, 11:11 AM Neelesh Salian <[email protected]> wrote:
> > > > > >
> > > > > >> Hi all,
> > > > > >>
> > > > > >> Each Iceberg implementation has its own tests, but there isn't a shared way to check that a table written by one is read the same way by another.
> > > > > >> A few examples that have come up across the implementations: a manifest written by one client that another can't read, a metadata.json one writer produces that another rejects because they disagree on whether a field is required, and a partition transform that ends up encoded more than one way across implementations. Some of these turned out to be bugs, others places where the spec is ambiguous.
> > > > > >>
> > > > > >> We think this is worth solving with some form of shared cross-implementation conformance testing, and we'd like to align as a community on whether to take it on and how best to start. We've written up our current thinking, a possible direction, and a small prototype in the doc below.
> > > > > >>
> > > > > >> Details, a repo design, and the interop failures we've collected:
> > > > > >> https://docs.google.com/document/d/1HRcUMcrqUjo4CjGdwAIw85f7miWOGJ4ZJ90AgHbahaw/edit?usp=sharing
> > > > > >>
> > > > > >> Feedback welcome on whether this is worth doing and how we might get started.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Neelesh (with Andrei Tserakhau)
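
The fixture shape Tanmay suggests in the thread (input -> golden file -> expected value) could look roughly like the Python sketch below. Everything named here (`Fixture`, `check_read`, `check_write`, and the JSON toy codec standing in for a real Iceberg reader and writer) is hypothetical and is not taken from the proposal doc or either prototype. It only shows how one fixture can serve both the read and the write direction while comparing decoded values rather than bytes:

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Fixture:
    """One conformance case: logical input, reference artifact, expected value."""
    name: str
    logical_input: Any  # what a writer under test is asked to write
    golden: bytes       # reference physical artifact checked into the repo
    expected: Any       # decoded value every reader should produce


def check_read(fixture: Fixture, decode: Callable[[bytes], Any]) -> bool:
    """Read test: the golden file must decode to the expected value."""
    return decode(fixture.golden) == fixture.expected


def check_write(fixture: Fixture,
                encode: Callable[[Any], bytes],
                reference_decode: Callable[[bytes], Any]) -> bool:
    """Write test: whatever the writer produces must decode to the expected value."""
    return reference_decode(encode(fixture.logical_input)) == fixture.expected


# Toy codec (assumption): JSON bytes stand in for a real file format.
def toy_decode(data: bytes) -> Any:
    return json.loads(data.decode("utf-8"))


def encode_compact(value: Any) -> bytes:
    # A valid encoding whose bytes differ from the golden file's.
    return json.dumps(value, separators=(",", ":")).encode("utf-8")


fixture = Fixture(
    name="toy-row",
    logical_input={"id": 1, "day": "2026-06-29"},
    golden=b'{"id": 1, "day": "2026-06-29"}',
    expected={"id": 1, "day": "2026-06-29"},
)
```

Here `encode_compact` emits different bytes from the golden file yet still passes `check_write`, which is the behaviour a decoded-value comparison is meant to allow.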

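The cross-engine check Danny describes (order-independent checksum, row count, NDVs) might be sketched in Python as follows. This is an assumption-laden illustration, not his team's implementation: `row_digest` and `table_summary` are hypothetical names, rows are taken to be tuples of plain Python values, and the `repr`-based canonical form assumes every engine returns the same Python types for the same logical value.

```python
import hashlib

MASK_64 = (1 << 64) - 1


def row_digest(row):
    """64-bit digest of one row, derived from a repr-based canonical form."""
    data = repr(tuple(row)).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def table_summary(rows):
    """Summarise a query result independently of row order.

    Per-row digests are added modulo 2**64; addition is commutative, so the
    checksum does not depend on the order in which rows are returned, and
    duplicate rows still contribute (unlike an XOR-based combination).
    """
    rows = [tuple(r) for r in rows]
    checksum = 0
    for row in rows:
        checksum = (checksum + row_digest(row)) & MASK_64
    # Number of distinct values per column.
    ndv = [len(set(col)) for col in zip(*rows)] if rows else []
    return {"row_count": len(rows), "checksum": checksum, "ndv": ndv}


# Two engines returning the same logical rows in a different order.
engine_a = [(1, "a"), (2, "b"), (3, "a")]
engine_b = [(3, "a"), (1, "a"), (2, "b")]
summary_a = table_summary(engine_a)
summary_b = table_summary(engine_b)
```

Comparing `summary_a` and `summary_b` for equality is then the agreement check; a result that drops or duplicates a row changes the row count, and with it the summary.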