[ 
https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15765160#comment-15765160
 ] 

Ivan Mushketyk commented on FLINK-5280:
---------------------------------------

Hi Fabian,

Thank you for your reply.

First, a question about your comment.

{quote}
In case of a Specific Avro record, we would need an additional step to copy the 
first-level Pojo fields into a Row
{quote}

Does "Specific Avro" mean a regular POJO?

Regarding the *TableSource* interface, I think I've lost track of what problem 
we are trying to solve here :)

I see the following problems with the current interface:
* There is no explicit relationship between field positions in a *Row* and 
the order of fields in a POJO type. As you mentioned, we can get the field 
order via *PojoTypeInfo.getFieldIndex()*. Since *TableSource* has a 
*getReturnType* method that returns *TypeInformation*, nothing needs to 
change in the *TableSource* interface to support it.
* The Row type does not have field names, which makes it problematic to access 
nested fields in nested Rows, but I believe this should be fixed in FLINK-5348.

Therefore it seems that the only thing left to do (apart from waiting for 
FLINK-5348 to be implemented) is to update *TableSourceTable* to use POJO 
fields in the correct order. Currently, it just generates indexes 0 to n-1:

{code}
class TableSourceTable(val tableSource: TableSource[_])
  extends FlinkTable[Row](
    typeInfo = new RowTypeInfo(tableSource.getFieldTypes),
    fieldIndexes = 0.until(tableSource.getNumberOfFields).toArray,
    fieldNames = tableSource.getFieldsNames)
{code}

while it should use the *PojoTypeInfo.getFieldIndex()* method to build a proper 
list of field indexes.
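
A rough sketch of that idea, kept free of Flink dependencies so it can be run on 
its own. The names *FieldIndexSketch*, *resolveFieldIndexes* and *sortedLookup* 
are hypothetical and not part of Flink. With a real *PojoTypeInfo*, the lookup 
function would simply be *getFieldIndex*, which returns -1 for an unknown field; 
the stand-in below only mimics the fact that *PojoTypeInfo* keeps its fields 
sorted by name.

{code}
// Hypothetical helper, not part of Flink: maps each declared field name to
// its position in the underlying type instead of assuming 0 until n.
object FieldIndexSketch {

  // `lookup` plays the role of PojoTypeInfo.getFieldIndex(name),
  // which returns -1 when the field does not exist.
  def resolveFieldIndexes(
      fieldNames: Array[String],
      lookup: String => Int): Array[Int] =
    fieldNames.map { name =>
      val idx = lookup(name)
      require(idx >= 0, s"Field '$name' not found in the source's return type")
      idx
    }

  // Stand-in for PojoTypeInfo (assumption for illustration only):
  // fields are indexed by their position after sorting by name.
  def sortedLookup(pojoFields: Seq[String]): String => Int = {
    val sorted = pojoFields.sorted
    name => sorted.indexOf(name)
  }
}
{code}

For a source declaring the fields "name" and "id" in that order, 
*resolveFieldIndexes(Array("name", "id"), sortedLookup(Seq("name", "id")))* 
yields the indexes 1 and 0 rather than 0 and 1, which is exactly the mismatch 
the current *TableSourceTable* does not account for.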

Am I missing something? Are there any *TableSource* limitations that I have 
overlooked?



> Extend TableSource to support nested data
> -----------------------------------------
>
>                 Key: FLINK-5280
>                 URL: https://issues.apache.org/jira/browse/FLINK-5280
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.2.0
>            Reporter: Fabian Hueske
>            Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface does currently only support the definition of 
> flat rows. 
> However, there are several storage formats for nested data that should be 
> supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can 
> also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in 
> Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
