> It would fail if the FileScanTask is some other implementation (like StaticDataTask).

Actually we faced exactly the same issue, and we have an internal patch to fix the parser for that. +1 for the proposal.

For the type names, can we come up with a different name from "base-file-task"? "base" is very specific to Java abstract classes. In fact, the StaticDataTask is not really scanning a file anyway, so maybe we should just call these file-scan-task, data-task, etc.?

Best,
Jack Ye

On Wed, Feb 14, 2024 at 4:01 PM Ryan Blue <b...@tabular.io> wrote:

> Thanks, Steven! Looks like the right direction to add other task types
> with their own serialization.
>
> I hadn't realized that these were in the table spec and not just the REST
> spec. What do you think about keeping JSON serialization that isn't part of
> table metadata in the REST spec? I'm actually pretty happy with OpenAPI for
> defining our JSON structures, so I think this would be easier in the REST
> spec. I would also consider an OpenAPI extension to the table spec for JSON
> objects since it is pretty easy to work with and does a good job defining
> the metadata.
>
> Ryan
>
> On Wed, Feb 14, 2024 at 3:48 PM Steven Wu <stevenz...@gmail.com> wrote:
>
>> The first linked reference is the PR for spec update.
>>
>> [3] https://github.com/apache/iceberg/pull/9728
>>
>> On Wed, Feb 14, 2024 at 3:36 PM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> We just ran out of time and didn't get a chance to discuss this in the
>>> community sync meeting today. Hence, I am raising the discussion here.
>>>
>>> We added JSON parsers for content file and file scan task a year ago
>>> [1]. Recently, I just realized the implementation only handles
>>> BaseFileScanTask. It would fail if the FileScanTask is some other
>>> implementation (like StaticDataTask).
>>>
>>> Eduard, Anton, and I have been discussing a solution in issue-9597 [2].
>>> We reached a consensus that we need to define a new `task-type` enum field
>>> to indicate the implementation class/type [3]. For backward compatibility,
>>> the lack of this new `task-type` field should be interpreted as
>>> `base-file-task`.
>>>
>>> Since this is a spec change, Anton suggested more visibility. Hence I am
>>> starting this discussion thread.
>>>
>>> [1] https://github.com/apache/iceberg/pull/6934
>>> [2] https://github.com/apache/iceberg/issues/9597
>>> [3] https://github.com/apache/iceberg/pull/9728
>>>
>>
>
> --
> Ryan Blue
> Tabular
>
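
To make the backward-compatibility rule concrete, here is a rough sketch of how a parser could resolve the `task-type` field. It is not the code from the PR or from Iceberg itself. The class name `TaskTypeSketch`, the enum, and the helper methods are all hypothetical, and the type names `file-scan-task` and `data-task` are just the ones floated in this thread, not final. A plain `Map` stands in for the parsed JSON object so the example has no dependencies.

```java
import java.util.Map;

public class TaskTypeSketch {

  // Hypothetical enum: the JSON names follow the suggestion in this thread and are not final.
  enum TaskType {
    FILE_SCAN_TASK("file-scan-task"),
    DATA_TASK("data-task");

    private final String jsonName;

    TaskType(String jsonName) {
      this.jsonName = jsonName;
    }

    String jsonName() {
      return jsonName;
    }

    static TaskType fromJsonName(String name) {
      // Backward compatibility: JSON written before the field existed carries no
      // task-type, so a missing value is read as the original file scan task.
      if (name == null) {
        return FILE_SCAN_TASK;
      }
      for (TaskType type : values()) {
        if (type.jsonName.equals(name)) {
          return type;
        }
      }
      throw new IllegalArgumentException("Unknown task-type: " + name);
    }
  }

  // The map is a stand-in for the fields of a parsed JSON object.
  static TaskType taskTypeOf(Map<String, String> fields) {
    return TaskType.fromJsonName(fields.get("task-type"));
  }

  public static void main(String[] args) {
    // Old payload without the field falls back to the default type.
    System.out.println(taskTypeOf(Map.of()));
    // New payload names its type explicitly.
    System.out.println(taskTypeOf(Map.of("task-type", "data-task")));
  }
}
```

A parser built this way would then dispatch to a per-type serializer based on the resolved enum value. An unknown value fails loudly instead of being silently treated as a file scan task.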