Re: spec question on equality deletes

2024-04-17 Thread Manu Zhang
+1 on defining it clearly in the spec. Note that the “spec doc” is the spec itself, which requires a more accurate description than ordinary documentation. We may also need spec tests to check whether a compute engine conforms to the spec, not the other way around. On Wed, Apr 17, 2024 at 01:08, Yufei Gu wrote: > For me, (b) is the right behav

Re: spec question on equality deletes

2024-04-16 Thread Yufei Gu
For me, (b) is the right behavior; we may just need to be clearer in the spec doc, but I am open to suggestions in case I missed something. Yufei On Mon, Apr 15, 2024 at 11:02 PM Renjie Liu wrote: > Hi, Wing: > > I totally agree that we should clearly define the expected behavior in > spec. I lean towards

Re: spec question on equality deletes

2024-04-15 Thread Renjie Liu
Hi, Wing: I totally agree that we should clearly define the expected behavior in the spec. I lean towards (a), e.g. the row should be either completely ignored or completely the same as the original row; any intermediate state should be defined as invalid. On Tue, Apr 16, 2024 at 8:40 AM Wing Yew Poon wrote: > Hi Yufei,

Re: spec question on equality deletes

2024-04-15 Thread Wing Yew Poon
Hi Yufei, Thank you for your response. It sounds like on 2, your thinking is that (b) is the correct behavior. Indeed, I have tried it out with Spark and afaict, it does (b). However, that does not mean that it is the correct behavior. The spec should clearly define it. - Wing Yew On Mon, Apr 15,

Re: spec question on equality deletes

2024-04-15 Thread Wing Yew Poon
rune deletion files, then inconsistent > column data may not affect the result. But in general it should be > considered as incorrect data. > > > > *From: *Wing Yew Poon > *Date: *Saturday, April 13, 2024 at 02:16 > *To: *dev@iceberg.apache.org > *Subject: *spec question

Re: spec question on equality deletes

2024-04-15 Thread Yufei Gu
Hi Wing Yew Poon, Here is my understanding, but not necessarily how an engine implements it. It should only consider the columns in equality_ids when we apply eq deletes. Also the engine should ignore the unrelated columns. It will still delete the row with id 3 in the following case you described
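The reading described in this message (match only on the columns listed in equality_ids and ignore every other column in the delete row) can be sketched as below. This is an illustrative sketch only, not how any engine implements it: the function name apply_equality_deletes, the FIELD_NAMES mapping, and the sample rows are all hypothetical. Only the field ids (1: id, 2: category, 3: name) come from the schema quoted in the original question.

```python
from typing import Any, Dict, List

# Field ids follow the schema quoted in the thread: 1: id, 2: category, 3: name.
FIELD_NAMES = {1: "id", 2: "category", 3: "name"}

Row = Dict[str, Any]


def apply_equality_deletes(
    data_rows: List[Row], delete_rows: List[Row], equality_ids: List[int]
) -> List[Row]:
    """Return the data rows that survive the equality deletes.

    Hypothetical helper: only the columns named by equality_ids take part
    in the comparison; any other column present in a delete row is ignored.
    """
    cols = [FIELD_NAMES[field_id] for field_id in equality_ids]
    # Build the set of key tuples to delete, using the equality columns only.
    delete_keys = {tuple(d[c] for c in cols) for d in delete_rows}
    return [r for r in data_rows if tuple(r[c] for c in cols) not in delete_keys]


# Hypothetical sample data, for illustration only.
data = [
    {"id": 1, "category": "marsupial", "name": "Koala"},
    {"id": 2, "category": "toy", "name": "Teddy"},
    {"id": 3, "category": "bear", "name": "Grizzly"},
]

# The delete row carries a category that does not match the data row with id 3.
# With equality_ids = [1], only "id" is compared, so the row is still deleted.
deletes = [{"id": 3, "category": "unrelated", "name": "Grizzly"}]

remaining = apply_equality_deletes(data, deletes, equality_ids=[1])
```

Under this reading, the inconsistent category value in the delete row has no effect, because category is not among the equality columns. Passing equality_ids=[1, 2] in the same sketch would leave the row with id 3 in place, since the category values then differ.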

Re: spec question on equality deletes

2024-04-13 Thread Renjie Liu
ute engine works. If the compute engine doesn’t try to prune deletion files, then inconsistent column data may not affect the result. But in general it should be considered as incorrect data. From: Wing Yew Poon Date: Saturday, April 13, 2024 at 02:16 To: dev@iceberg.apache.org Subject: spec q

spec question on equality deletes

2024-04-12 Thread Wing Yew Poon
Hi, I have some questions on the current Iceberg spec regarding equality deletes: https://iceberg.apache.org/spec/#equality-delete-files The spec says that for "a table with the following data: 1: id | 2: category | 3: name