[ https://issues.apache.org/jira/browse/FLINK-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900431#comment-16900431 ]
Seth Wiesman edited comment on FLINK-13517 at 8/5/19 9:59 PM:
--------------------------------------------------------------

Copying [~phoenixjiangnan]'s comment from github

{noformat}
The original intention for hive_integration.md is to only focus on Flink-Hive interoperability, which is just using Flink to deal with native Hive. It should later have more Hive-compatibility related documentations, e.g. DDL, DML, to demonstrate how to migrate or achieve same Hive functionality or semantics in Flink. catalog.md is intentionally focused on metadata.

This change seems to complicate the situation, and we may be forced to split the content again in the future when hive_integration.md becomes too large.

Merging the example page looks fine
{noformat}

It sounds to me like we have the same idea about how we want the docs to work but different ideas about how to get there. In my mind, for catalog.md to be intentionally focused on metadata it should cover:

* What catalogs are and what they do
* What catalogs Flink provides
* How to use them
* Optionally, information about how to implement your own

Information such as the Hive <-> Flink data type mapping and which specific Hive versions and libraries are required does not fit in that scope. In theory Flink may support more built-in catalogs in the future, and catalog.md shouldn't need to be restructured to support that. A Hive-specific page can then cover these details. That keeps everything well organized, so as the details of Flink's Hive integration evolve the documentation does not need to be updated in multiple places. In the future hive_integration might be split into

{noformat}
hive:
  -> overview.md
  -> dml.md
  -> etc.
{noformat}

but again, all the Hive-specific docs are in one place and I do not see splitting docs as being an overly difficult task. I would even be happy to split hive_integration into something like this now if you think it makes sense.

was (Author: sjwiesman):
Copying [~phoenixjiangnan]'s comment from github

{noformat}
The original intention for hive_integration.md is to only focus on Flink-Hive interoperability, which is just using Flink to deal with native Hive. It should later have more Hive-compatibility related documentations, e.g. DDL, DML, to demonstrate how to migrate or achieve same Hive functionality or semantics in Flink. catalog.md is intentionally focused on metadata.

This change seems to complicate the situation, and we may be forced to split the content again in the future when hive_integration.md becomes too large.

Merging the example page looks fine
{noformat}

It sounds to me like we have the same idea about how we want the docs to work but different ideas about how to get there. In my mind, for catalog.md to be intentionally focused on metadata it should cover:

* What catalogs are and what they do
* What catalogs Flink provides
* How to use them
* Optionally, information about how to implement your own

Information such as the Hive <-> Flink data type mapping and which specific Hive versions and libraries are required does not fit in that scope. In theory Flink may support more built-in catalogs in the future, and catalog.md shouldn't need to be restructured to support that. A Hive-specific page can then cover these details. That keeps everything well organized, so as the details of Flink's Hive integration evolve the documentation does not need to be updated in multiple places. In the future hive_integration might be split into

{noformat}
hive:
  -> overview.md
  -> dml.md
  -> etc.
{noformat}

but again, all the Hive-specific docs are in one place and I do not see splitting docs as being an overly difficult task. If you even think we should have something that looks like this now I would be happy to add it.

> Restructure Hive Catalog documentation
> --------------------------------------
>
>                 Key: FLINK-13517
>                 URL: https://issues.apache.org/jira/browse/FLINK-13517
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Hive, Documentation
>            Reporter: Seth Wiesman
>            Assignee: Seth Wiesman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive documentation is currently spread across a number of pages and
> fragmented. In particular:
> 1) An example was added to getting-started/examples, however, this section is
> being removed
> 2) There is a dedicated page on hive integration but also a lot of hive
> specific information is on the catalog page
> We should
> 1) Inline the example into the hive integration page
> 2) Move the hive specific information on catalogs.md to hive_integration.md
> 3) Make catalogs.md be just about catalogs in general and link to the hive
> integration.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)