@Ryan, the JSON serialization is also used by Flink for checkpoint state. so it is not purely a REST API thing.
@Jack, Ryan also had the same suggestion in the PR comment. I have updated the naming On Wed, Feb 14, 2024 at 4:08 PM Jack Ye <yezhao...@gmail.com> wrote: > > It would fail if the FileScanTask is some other implementation (like > StaticDataTask). > Actually we faced exactly the same issue, and we have an internal patch to > fix the parser for that. +1 for the proposal. > > For the type names, can we come up with a different name from " > base-file-task"? "base" is very Java abstract class specific. In fact, > the StaticDataTask is not really scanning a file anyway, maybe we should > just call these like file-scan-task, data-task, etc.? > > Best, > Jack Ye > > > On Wed, Feb 14, 2024 at 4:01 PM Ryan Blue <b...@tabular.io> wrote: > >> Thanks, Steven! Looks like the right direction to add other task types >> with their own serialization. >> >> I hadn't realized that these were in the table spec and not just the REST >> spec. What do you think about keeping JSON serialization that isn't part of >> table metadata in the REST spec? I'm actually pretty happy with OpenAPI for >> defining our JSON structures, so I think this would be easier in the REST >> spec. I would also consider an OpenAPI extension to the table spec for JSON >> objects since it is pretty easy to work with and does a good job defining >> the metadata. >> >> Ryan >> >> On Wed, Feb 14, 2024 at 3:48 PM Steven Wu <stevenz...@gmail.com> wrote: >> >>> The first linked reference is the PR for spec update. >>> >>> [3] https://github.com/apache/iceberg/pull/9728 >>> >>> On Wed, Feb 14, 2024 at 3:36 PM Steven Wu <stevenz...@gmail.com> wrote: >>> >>>> We just ran out of time and didn't get a chance to discuss this in the >>>> community sync meeting today. Hence, I am raising the discussion here. >>>> >>>> We added JSON parsers for content file and file scan task a year ago >>>> [1]. Recently, I just realized the implementation only handles >>>> BaseFileScanTask. It would fail if the FileScanTask is some other >>>> implementation (like StaticDataTask). >>>> >>>> Eduard, Anton, and I have been discussing a solution in issue-9597 [2]. >>>> We reached a consensus that we need to define a new `task-type` enum field >>>> to indicate the implementation class/type [3]. For backward compatibility, >>>> the lack of this new `task-type` field should be interpreted as >>>> `base-file-task`. >>>> >>>> Since this is a spec change, Anton suggested more visibility. Hence I >>>> am starting this discussion thread. >>>> >>>> [1] https://github.com/apache/iceberg/pull/6934 >>>> [2] https://github.com/apache/iceberg/issues/9597 >>>> [3] https://github.com/apache/iceberg/pull/9728 >>>> >>> >> >> -- >> Ryan Blue >> Tabular >> >