@Ryan, the JSON serialization is also used by Flink for checkpoint state.
so it is not purely a REST API thing.

@Jack, Ryan also had the same suggestion in the PR comment. I have updated
the naming

On Wed, Feb 14, 2024 at 4:08 PM Jack Ye <yezhao...@gmail.com> wrote:

> > It would fail if the FileScanTask is some other implementation (like
> StaticDataTask).
> Actually we faced exactly the same issue, and we have an internal patch to
> fix the parser for that. +1 for the proposal.
>
> For the type names, can we come up with a different name from "
> base-file-task"? "base" is very Java abstract class specific. In fact,
> the StaticDataTask is not really scanning a file anyway, maybe we should
> just call these like file-scan-task, data-task, etc.?
>
> Best,
> Jack Ye
>
>
> On Wed, Feb 14, 2024 at 4:01 PM Ryan Blue <b...@tabular.io> wrote:
>
>> Thanks, Steven! Looks like the right direction to add other task types
>> with their own serialization.
>>
>> I hadn't realized that these were in the table spec and not just the REST
>> spec. What do you think about keeping JSON serialization that isn't part of
>> table metadata in the REST spec? I'm actually pretty happy with OpenAPI for
>> defining our JSON structures, so I think this would be easier in the REST
>> spec. I would also consider an OpenAPI extension to the table spec for JSON
>> objects since it is pretty easy to work with and does a good job defining
>> the metadata.
>>
>> Ryan
>>
>> On Wed, Feb 14, 2024 at 3:48 PM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> The first linked reference is the PR for spec update.
>>>
>>> [3] https://github.com/apache/iceberg/pull/9728
>>>
>>> On Wed, Feb 14, 2024 at 3:36 PM Steven Wu <stevenz...@gmail.com> wrote:
>>>
>>>> We just ran out of time and didn't get a chance to discuss this in the
>>>> community sync meeting today. Hence, I am raising the discussion here.
>>>>
>>>> We added JSON parsers for content file and file scan task a year ago
>>>> [1]. Recently, I just realized the implementation only handles
>>>> BaseFileScanTask. It would fail if the FileScanTask is some other
>>>> implementation (like StaticDataTask).
>>>>
>>>> Eduard, Anton, and I have been discussing a solution in issue-9597 [2].
>>>> We reached a consensus that we need to define a new `task-type` enum field
>>>> to indicate the implementation class/type [3]. For backward compatibility,
>>>> the lack of this new `task-type` field should be interpreted as
>>>> `base-file-task`.
>>>>
>>>> Since this is a spec change, Anton suggested more visibility. Hence I
>>>> am starting this discussion thread.
>>>>
>>>> [1] https://github.com/apache/iceberg/pull/6934
>>>> [2] https://github.com/apache/iceberg/issues/9597
>>>> [3] https://github.com/apache/iceberg/pull/9728
>>>>
>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>

Reply via email to