LeonBein commented on a change in pull request #15109:
URL: https://github.com/apache/flink/pull/15109#discussion_r592511610



##########
File path: flink-connectors/flink-connector-hbase/README.md
##########
@@ -0,0 +1,134 @@
+# Flink HBase Connector
+
+This connector provides classes that allow Flink to access [HBase](https://hbase.apache.org/).
+It supports the new Source and Sink API specified in [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface) and [FLIP-143](https://cwiki.apache.org/confluence/display/FLINK/FLIP-143%3A+Unified+Sink+API).
+
+## Building the connector
+
+Note that the streaming connectors are not part of the binary distribution of Flink.
+You need to link them into your job jar for cluster execution.
+See how to link with them for cluster execution [here](https://ci.apache.org/projects/flink/flink-docs-stable/dev/project-configuration.html#adding-connector-and-library-dependencies).
+
+The connector can be built by using maven:
+
+```
+cd flink-connectors/flink-connector-hbase
+mvn clean install
+```
+
+## Installing HBase
+
+Follow the instructions from the [HBase Quick Start Guide](http://hbase.apache.org/book.html#quickstart) to install HBase.
+
+*Version Compatibility*: This module is compatible with Apache HBase *2.3.4*.
+
+## HBase Configuration
+
+Connecting to HBase always requires a `Configuration` instance.
+If there is an HBase gateway on the same host as the Flink gateway where the application is started, this can be obtained by invoking `HBaseConfiguration.create()` as in the examples below.
+If that's not the case a configuration should be provided where the proper core-site, hdfs-site, and hbase-site are added as resources.
+
+## DataStream API
+
+### Reading data from HBase
+
+To receive data from HBase, the connector makes use of the internal replication mechanism of HBase.
+The connector registers at the HBase cluster as a *Replication Peer* and will receive all change events from HBase.
+
+For the replication to work, the HBase config needs to have replication enabled in the `hbase-site.xml` file:
+```xml
+<configuration>
+  <property>
+    <name>hbase.replication</name>
+    <value>true</value>
+  </property>
+  ...
+</configuration>
+```
+All incoming events to Flink will be processed as an `HBaseEvent`. 
+You will need to specify a Deserializer which will transform each event from an `HBaseEvent` to the desired DataStream type.
+
+```java
+StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
+
+Configuration hbaseConfig = HBaseConfiguration.create();
+hbaseConfig.setBoolean("hbase.replication", true);

Review comment:
       Removed in 047d43e21488af3a8f405e8e6c01e502721fd80f





