[ 
https://issues.apache.org/jira/browse/FLINK-13517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900431#comment-16900431
 ] 

Seth Wiesman edited comment on FLINK-13517 at 8/5/19 9:59 PM:
--------------------------------------------------------------

Copying [~phoenixjiangnan]'s comment from GitHub:

 
{noformat}
The original intention for hive_integration.md is to only focus on Flink-Hive 
interoperability, which is just using Flink to deal with native Hive. It should 
later have more Hive-compatibility related documentations , e.g. DDL, DML, to 
demonstrate how to migrate or achieve same Hive functionality or semantics in 
Flink. catalog.md is intentionally focused on metadata. This change seems to 
complicate the situation, and we may be forced to split the content again in 
the future when hive_integration.md becomes too large.

Merging the example page looks fine
{noformat}
It sounds to me like we have the same idea about how we want the docs to work 
but different ideas about how to get there.

In my mind, for catalog.md to be intentionally focused on metadata it should 
cover:
 * What catalogs are and what they do
 * What catalogs Flink provides
 * How to use them
 * Optionally, information about how to implement your own

Information such as the Hive <-> Flink data type mapping and which specific Hive 
versions and libraries are required does not fit in that scope. In theory, Flink 
may support more built-in catalogs in the future, and catalog.md shouldn't need 
to be restructured to support that.

 

A Hive-specific page can then cover these details. That keeps everything well 
organized, so as the details of Flink's Hive integration evolve, the 
documentation does not need to be updated in multiple places. In the future, 
hive_integration might be split into

 
{noformat}
hive:
-> overview.md
-> dml.md
-> etc.
{noformat}
but again, all the Hive-specific docs are in one place, and I do not see 
splitting docs as being an overly difficult task. I would even be happy to 
split hive_integration into something like this now if you think it makes sense.

 

 



> Restructure Hive Catalog documentation
> --------------------------------------
>
>                 Key: FLINK-13517
>                 URL: https://issues.apache.org/jira/browse/FLINK-13517
>             Project: Flink
>          Issue Type: Improvement
>          Components: Connectors / Hive, Documentation
>            Reporter: Seth Wiesman
>            Assignee: Seth Wiesman
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive documentation is currently fragmented and spread across a number of 
> pages. In particular:
> 1) An example was added to getting-started/examples; however, this section is 
> being removed
> 2) There is a dedicated page on Hive integration, but a lot of Hive-specific 
> information is also on the catalog page
> We should
> 1) Inline the example into the Hive integration page
> 2) Move the Hive-specific information in catalogs.md to hive_integration.md
> 3) Make catalogs.md cover only catalogs in general and link to the Hive 
> integration page.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
