Hi, I believe it only matters if you have conflicting commits. For the single-writer case, I think you are right and it should not matter, so you may save very slightly on performance by switching to snapshot isolation. The checks are metadata checks though, so I would not expect a significant performance difference.
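To illustrate why the level should not matter for a single writer, here is a toy model of a commit-time check. It is a sketch under stated assumptions, not Iceberg's actual code: `ToyTable`, `Snapshot`, `CommitConflict`, `append` and `commit_overwrite` are all hypothetical names, and the serializable check is deliberately simplified to treat any concurrently added file as a conflict, whereas a real implementation would presumably only consider files that could match the operation's filter.

```python
from dataclasses import dataclass, field


class CommitConflict(Exception):
    """Raised when the commit-time validation finds a conflicting change."""


@dataclass
class Snapshot:
    snapshot_id: int
    added_files: set = field(default_factory=set)
    removed_files: set = field(default_factory=set)


class ToyTable:
    """Hypothetical in-memory table that only tracks snapshot metadata."""

    def __init__(self):
        self.snapshots = [Snapshot(0)]

    def current_id(self):
        return self.snapshots[-1].snapshot_id

    def _commit(self, added, removed):
        snap = Snapshot(self.current_id() + 1, set(added), set(removed))
        self.snapshots.append(snap)
        return snap.snapshot_id

    def append(self, added):
        # Plain appends are modeled as never conflicting on data.
        return self._commit(added, set())

    def commit_overwrite(self, start_id, added, removed, isolation="serializable"):
        # Only snapshots committed after this operation started are examined,
        # so the check is a comparison of metadata, not a scan of data.
        concurrent = [s for s in self.snapshots if s.snapshot_id > start_id]
        for s in concurrent:
            # Both levels: fail if a file we are replacing was already removed.
            if s.removed_files & set(removed):
                raise CommitConflict("a file being rewritten was removed concurrently")
            # Serializable only (simplified assumption): fail on any new file.
            if isolation == "serializable" and s.added_files:
                raise CommitConflict("new data was added concurrently")
        return self._commit(added, removed)
```

With a single writer, no snapshot is ever committed between an operation's start and its commit, so `concurrent` is empty and both levels take the same path. The two levels only diverge in this model when another writer adds files in between.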
In general, the isolation levels in Iceberg work by checking, before commit, whether any conflicting changes were made to the data files about to be committed since the operation first started (i.e., since its starting snapshot id). So if the commit fails because of the isolation level, I believe the error bubbles back to the application to try again, hence ‘pessimistic’. Note that metadata conflicts are automatically retried and should rarely bubble up to the user, so error handling is required only for a data-level isolation conflict (e.g., you delete a file that is currently being rewritten by another operation).

Hope that helps
Szehon

> On May 4, 2023, at 12:19 PM, Nirav Patel <nira...@gmail.com> wrote:
>
> I am trying to ingest data into an Iceberg table using Spark streaming. There
> are no multiple writers to the same data at the moment. According to the Iceberg API
> <https://iceberg.apache.org/javadoc/0.11.0/org/apache/iceberg/IsolationLevel.html#:%7E:text=Both%20of%20them%20provide%20a,environments%20with%20many%20concurrent%20writers.>
> the default isolation level for a table is serializable. I want to understand: if
> there is only a single application (a single Spark streaming job in my case)
> writing to the Iceberg table, is there any advantage or disadvantage to using
> serializable over snapshot isolation? Is there any performance impact of
> using serializable when only one application is writing to the table? Also, it
> seems Iceberg allows all writers to write into a snapshot and uses OCC to decide
> if one needs to retry because it was late. In this case, how is it
> serializable at all? Isn't serializability achieved via pessimistic
> concurrency control? I would like to understand how Iceberg implements the
> serializable isolation level and how it differs from snapshot isolation.
>
> Thanks