> > It’s not at all clear why unique keys would be needed at all.

If we turn your questions around, you answer yourself: if you have independent writers, you need unique keys.

> Also, truly independent writers (like a job writing while a job compacts)
> effectively mean a distributed transaction, and I believe it's clearly out
> of scope for Iceberg to solve that?

Assuming a single process is writing seems severely limiting in design and scale. I'm also surprised that you would think this is outside of Iceberg's scope. A table format that can only be modified by a single process basically locks that format into a single tool for a particular deployment.

> Uniqueness - enforcing uniqueness at scale is not feasible (provably so).

Expecting uniqueness is different than enforcing it. If you're saying it is impossible to enforce, I understand that. That doesn't mean we can't define a system where uniqueness is expected and there are ramifications if it is not maintained.

> Also, at scale, it's really only feasible to do query and update/upsert on
> the partition/bucket/sort key; any other access is likely a full scan of
> terabytes of data on remote storage.

I'm not sure why you would say that unless you assume a particular implementation. Single-record deletion is definitely an important use case. There is no need to do a full table scan to accomplish it unless you're assuming an eager approach to deletion.

I do continue to wonder how much of this back and forth is the mixing of thinking around restatement (eager) versus delta (lazy) implementations. Maybe we should separate them out as two different conversations?
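
To make the restatement/delta distinction concrete, here is a rough sketch of what I mean by the two approaches to deleting a single record. This is illustrative pseudo-code only, not Iceberg's API; Snapshot, DataFile, delete_eagerly, and delete_lazily are made-up names for the example:

    # Illustrative sketch only -- not Iceberg's API or data structures.
    from dataclasses import dataclass, field
    from typing import List, Set

    @dataclass
    class DataFile:
        path: str
        rows: List[dict]  # stand-in for Parquet/ORC file contents

    @dataclass
    class Snapshot:
        data_files: List[DataFile]
        delete_keys: Set[str] = field(default_factory=set)  # delta ("lazy") markers

    def delete_eagerly(snap: Snapshot, key: str) -> Snapshot:
        # Restatement: rewrite the data files so the deleted row is gone.
        # Write cost scales with the size of the affected files.
        rewritten = [DataFile(f.path, [r for r in f.rows if r["key"] != key])
                     for f in snap.data_files]
        return Snapshot(rewritten, set(snap.delete_keys))

    def delete_lazily(snap: Snapshot, key: str) -> Snapshot:
        # Delta: just record the deleted key; no data file is touched until compaction.
        return Snapshot(snap.data_files, snap.delete_keys | {key})

    def read(snap: Snapshot) -> List[dict]:
        # Readers merge data files with the accumulated delete markers.
        return [r for f in snap.data_files for r in f.rows
                if r["key"] not in snap.delete_keys]

The "full scan of terabytes" cost really only shows up on the eager path, where the delete has to rewrite data files; the lazy path just appends a marker and defers the work to reads and compaction.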