Hello everyone,

I am finding it slightly hard to find slots where most of the folks across time zones are available and comfortable to join, so I will hold off for now and keep this thread updated. Apologies for any inconvenience.
Regards,
Prashant Singh

On Thu, Oct 13, 2022 at 5:57 PM Prashant Singh <prashant010...@gmail.com> wrote:

> Thanks for the feedback, Péter.
>
> > Do I understand correctly that the main issue is that the concurrent compactions and writes (with deletes/updates) cause conflicts?
>
> Yes, the main issue we are trying to solve is the conflicts happening between maintenance processes and other writes.
>
> Regarding the Hive approach you suggested: as you already pointed out, it has drawbacks, the major ones being a spec change and a performance penalty, so we wanted to avoid that.
>
> In the proposed approach, we wanted to utilize existing functionality to achieve this by storing this extra info: if a writer has this info (the process running against the partition), it can benefit from it; otherwise it continues working as it did without this change.
>
> Regards,
> Prashant Singh
>
> On Fri, Oct 7, 2022 at 10:39 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>
>> One more thing:
>> - In Hive, we order the data files based on the original fileName and rowNum. This helps when reading the delete files, as we do not need to keep delete file data in memory. Iceberg tables can be sorted, so we either have to keep the already-read delete data in memory or reread the delete files when the order of the rows changes.
>>
>> I do not think we would like to sacrifice query performance for table maintenance.
>>
>> On Thu, Oct 6, 2022, 22:26 Prashant Singh <prashant010...@gmail.com> wrote:
>>
>>> Hello all,
>>>
>>> I was OOO and just saw the mail.
>>>
>>> Thanks, Ryan and Peter, for the feedback; I will address it and update the doc accordingly.
>>>
>>> As some of us are not available in the proposed slot, and we also need some time to address the feedback, I will move this meeting to next week and propose some slots accordingly (I will reach out to interested folks via Slack as well to find the slots). Apologies for any inconvenience.
>>>
>>> Regards,
>>> Prashant Singh
>>>
>>> On Wed, Oct 5, 2022 at 11:17 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>
>>>> Do I understand correctly that the main issue is that the concurrent compactions and writes (with deletes/updates) cause conflicts?
>>>>
>>>> In Hive we do compactions in a different way, by storing the original fileName/rowId (translated to Iceberg concepts) in the compacted files. This way, when a concurrent delete comes in later, any follow-up query can still find the corresponding delete and omit the row from the result.
>>>>
>>>> The big advantage of this approach is that compactions can happen in the background without any interference with concurrent queries.
>>>>
>>>> There are several drawbacks:
>>>> - This would be a table format change!
>>>> - Re-adding a file with the same fileName becomes even more problematic
>>>> - The size of the files will grow
>>>> - Queries become somewhat more complex, as we need to implement a different delete file lookup for compacted files
>>>>
>>>> Do we see this issue as important enough to merit the above changes/added complexity?
>>>>
>>>> Thanks,
>>>> Peter
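To make the Hive-style idea above concrete, here is a minimal, hypothetical sketch (the RowLineage and CompactedFileReader names are illustrative only, not Hive or Iceberg APIs) of how a reader could filter rows in a compacted file when every row carries its original fileName/rowNum, so that position deletes written against the original files still apply:

    // Illustrative only: hypothetical types, not part of Hive or Iceberg.
    import java.util.Map;
    import java.util.Set;

    // Each row in a compacted file remembers where it originally lived.
    record RowLineage(String originalFile, long originalPos) {}

    class CompactedFileReader {
      // Delete index: original data file path -> positions deleted in that file.
      private final Map<String, Set<Long>> positionDeletes;

      CompactedFileReader(Map<String, Set<Long>> positionDeletes) {
        this.positionDeletes = positionDeletes;
      }

      // A row survives unless a delete targets its original (file, position).
      boolean isLive(RowLineage lineage) {
        Set<Long> deleted = positionDeletes.get(lineage.originalFile());
        return deleted == null || !deleted.contains(lineage.originalPos());
      }
    }

If the compacted file is also ordered by (originalFile, originalPos), the delete index can be streamed rather than held fully in memory, which is the ordering concern raised in the "One more thing" note above.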
>>>> On Wed, Oct 5, 2022, 01:21 Ryan Blue <b...@tabular.io> wrote:
>>>>
>>>>> I won't be able to make it to the discussion, so I wanted to share a few thoughts here ahead of time.
>>>>>
>>>>> I'm fairly skeptical that this is the right approach. A locking scheme that requires participation is going to require a significant change to the way we think about concurrency. And a locking scheme at the partition granularity is going to be difficult to set up.
>>>>>
>>>>> Also, I don't think that the design doc covers these issues in enough detail. I think there are some gaps with significant questions, like how to proceed when a lock check has been done but another process with higher priority comes in. It seems like, even ignoring the partition granularity problem and assuming that we have writers that all participate, combining priority with locking creates a situation where a process can think it holds the lock but does not, because another process preempted it.
>>>>>
>>>>> I think some of these could be resolved by making this locking scheme informational but still using the existing method to handle concurrency. But does that actually fix the problem?
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Mon, Oct 3, 2022 at 12:56 PM Prashant Singh <prashant010...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Wing,
>>>>>>
>>>>>> Great to have you on board; I really appreciate your feedback so far on the proposal. Looking forward to more in the discussion.
>>>>>>
>>>>>> Regards,
>>>>>> Prashant
>>>>>>
>>>>>> On Tue, Oct 4, 2022 at 12:45 AM Wing Yew Poon <wyp...@cloudera.com.invalid> wrote:
>>>>>>
>>>>>>> Prashant, I just saw Jack's post mentioning that you're in India Time. Obviously daytime Pacific is not convenient for you. I'm fine with 9 pm Pacific.
>>>>>>>
>>>>>>> On Mon, Oct 3, 2022 at 12:09 PM Wing Yew Poon <wyp...@cloudera.com> wrote:
>>>>>>>
>>>>>>>> Hi Prashant,
>>>>>>>> I am very interested in this proposal and would like to attend this meeting. Friday, October 7 is fine with me; I can do 9 pm Pacific Time if that is what works for you (I don't know what time zone you're in), although any time between 2 and 6 pm would be more convenient.
>>>>>>>> Thanks,
>>>>>>>> Wing Yew
>>>>>>>>
>>>>>>>> On Mon, Oct 3, 2022 at 11:58 AM Prashant Singh <prashant010...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks Ryan,
>>>>>>>>>
>>>>>>>>> Should I go ahead and schedule this for around 10/7 at 9:00 PM PST? Will that work?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Prashant Singh
>>>>>>>>>
>>>>>>>>> On Fri, Sep 30, 2022 at 9:21 PM Ryan Blue <b...@tabular.io> wrote:
>>>>>>>>>
>>>>>>>>>> Prashant, great to see the PR for rollback on conflict! I'll take a look at that one. Friday 10/7 after 1:30 PM works for me. Looking forward to the discussion!
>>>>>>>>>>
>>>>>>>>>> On Fri, Sep 30, 2022 at 6:38 AM Prashant Singh <prashant010...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello folks,
>>>>>>>>>>>
>>>>>>>>>>> I was planning to host a discussion on this proposal <https://docs.google.com/document/d/1pSqxf5A59J062j9VFF5rcCpbW9vdTbBKTmjps80D-B0/edit> somewhere around late next week.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know your availability if you are interested in attending, and I will schedule the (online) meeting accordingly.
>>>>>>>>>>>
>>>>>>>>>>> Meanwhile, I have a PR <https://github.com/apache/iceberg/pull/5888> out as well, to roll back compaction on conflict detection (an approach that came up as an alternative to the proposal in a sync). Appreciate your feedback here as well.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Prashant Singh
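As a rough sketch of the "roll back compaction on conflict" idea, here is one way a maintenance job could yield to conflicting writers using public Iceberg Spark action APIs. This is an assumption about the general pattern, not necessarily how the linked PR implements it; it also assumes the conflict surfaces as a ValidationException or CommitFailedException from the rewrite, and the "day" partition column and CompactionRunner name are placeholders:

    // Sketch only: abandon the compaction on conflict instead of failing the other writer.
    import org.apache.iceberg.Table;
    import org.apache.iceberg.actions.RewriteDataFiles;
    import org.apache.iceberg.exceptions.CommitFailedException;
    import org.apache.iceberg.exceptions.ValidationException;
    import org.apache.iceberg.expressions.Expressions;
    import org.apache.iceberg.spark.actions.SparkActions;
    import org.apache.spark.sql.SparkSession;

    class CompactionRunner {
      // Compacts a single partition and treats a conflict as "the other writer wins".
      static void compactDay(SparkSession spark, Table table, String day) {
        try {
          RewriteDataFiles.Result result = SparkActions.get(spark)
              .rewriteDataFiles(table)
              .filter(Expressions.equal("day", day)) // limit the scope to one partition
              .execute();
          System.out.println("Rewrote " + result.rewrittenDataFilesCount() + " data files");
        } catch (ValidationException | CommitFailedException e) {
          // A concurrent write (e.g. a delete touching the rewritten files) conflicted;
          // discard the compaction work and let the other job's commit stand.
          System.out.println("Compaction abandoned due to conflict: " + e.getMessage());
        }
      }
    }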
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 17, 2022 at 6:25 PM Prashant Singh <prashant010...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello all,
>>>>>>>>>>>>
>>>>>>>>>>>> We have been working on a proposal [link <https://docs.google.com/document/d/1pSqxf5A59J062j9VFF5rcCpbW9vdTbBKTmjps80D-B0/edit#>] to determine the precedence between two or more concurrently running jobs in case of conflicts.
>>>>>>>>>>>>
>>>>>>>>>>>> Please take some time to review the proposal.
>>>>>>>>>>>>
>>>>>>>>>>>> We would appreciate any feedback on this from the community!
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Prashant Singh
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Ryan Blue
>>>>>>>>>> Tabular
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Tabular