This is an automated email from the ASF dual-hosted git repository.
xxyu pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/document by this push:
new f35dfb7 Fix typos (#1553)
f35dfb7 is described below
commit f35dfb7178ef1781af420063ffea89c8eb254bda
Author: Helen Zeng <[email protected]>
AuthorDate: Fri Jan 22 20:41:30 2021 +0800
Fix typos (#1553)
* fix typos
* grammar fix
---
website/_docs/gettingstarted/faq.md | 58 +++++++++++++++----------------
website/_docs/install/Kylin_kubernetes.md | 44 +++++++++++------------
website/_docs/install/kylin_cluster.cn.md | 4 +--
website/_docs/install/kylin_cluster.md | 8 ++---
4 files changed, 57 insertions(+), 57 deletions(-)
diff --git a/website/_docs/gettingstarted/faq.md
b/website/_docs/gettingstarted/faq.md
index 9305045..5c5a403 100644
--- a/website/_docs/gettingstarted/faq.md
+++ b/website/_docs/gettingstarted/faq.md
@@ -14,15 +14,15 @@ There is an article about [how to ask a question in a smart
way](http://catb.org
#### Is Kylin a generic SQL engine for big data?
- * No, Kylin is an OLAP engine with SQL interface. The SQL queries need be
matched with the pre-defined OLAP model.
+ * No, Kylin is an OLAP engine with a SQL interface. The SQL queries should be
matched with the pre-defined OLAP model.
#### What's a typical scenario to use Apache Kylin?
- * Kylin can be the best option if you have a huge table (e.g., >100 million
rows), join with lookup tables, while queries need be finished in the second
level (dashboards, interactive reports, business intelligence, etc), and the
concurrent users can be dozens or hundreds.
+ * Kylin can be the best option if you have a huge table (e.g., >100 million
rows) joined with lookup tables, queries need to be finished at the second
level (dashboards, interactive reports, business intelligence, etc.), and the
concurrent users can be dozens or hundreds.
#### How large a data scale can Kylin support? How about the performance?
- * Kylin can supports second level query performance at TB to PB level
dataset. This has been verified by users like eBay, Meituan, Toutiao. Take
Meituan's case as an example (till 2018-08), 973 cubes, 3.8 million queries per
day, raw data 8.9 trillion, total cube size 971 TB (original data is bigger),
50% queries finished in < 0.5 seconds, 90% queries < 1.2 seconds.
+ * Kylin can support second-level query performance on TB to PB level
datasets. This has been verified by users like eBay, Meituan, and Toutiao. Take
Meituan's case as an example (as of 2018-08): 973 cubes, 3.8 million queries per
day, raw data 8.9 trillion, total cube size 971 TB (original data is bigger),
50% of the queries finished in < 0.5 seconds, 90% in < 1.2 seconds.
#### Who are using Apache Kylin?
@@ -34,11 +34,11 @@ There is an article about [how to ask a question in a smart
way](http://catb.org
#### How to compare Kylin with other SQL engines like Hive, Presto, Spark SQL,
Impala?
- * They answer a query in different ways. Kylin is not a replacement for
them, but a supplement (query accelerator). Many users run Kylin together with
other SQL engines. For the high frequent query patterns, building Cubes can
greatly improve the performance and also offload cluster workloads. For less
queried patterns or ad-hoc queries, ther MPP engines are more flexible.
+ * They answer a query in different ways. Kylin is not a replacement for
them, but a supplement (query accelerator). Many users run Kylin together with
other SQL engines. For high-frequency query patterns, building Cubes can
greatly improve the performance and also offload cluster workloads. For less
queried patterns or ad-hoc queries, other MPP engines are more flexible.
#### How to compare Kylin with Druid?
- * Druid is more suitable for real-time analysis. Kylin is more focus on OLAP
case. Druid has good integration with Kafka as real-time streaming; Kylin
fetches data from Hive or Kafka in batches. The real-time capability of Kylin
is still under development.
+ * Druid is more suitable for real-time analysis. Kylin focuses more on the
OLAP case. Druid has good integration with Kafka for real-time streaming; Kylin
fetches data from Hive or Kafka in batches. The real-time capability of Kylin
is still under development.
* Many internet service providers host both Druid and Kylin, serving
different purposes (real-time and historical).
@@ -58,13 +58,13 @@ There is an article about [how to ask a question in a smart
way](http://catb.org
#### How many dimensions can be in a cube?
- * The max physical dimension number (exclude derived column in lookup
tables) in a cube is 63; If you can normalize some dimensions to lookup tables,
with derived dimensions, you can create a cube with more than 100 dimensions.
+ * The max physical dimension number (excluding derived columns in lookup
tables) in a cube is 63; if you can normalize some dimensions to lookup tables,
with derived dimensions, you can create a cube with more than 100 dimensions.
* But a cube with > 30 physical dimensions is not recommended; you couldn't
even save that in Kylin if you don't optimize the aggregation groups.
Please search "curse of dimensionality".
-#### Why I got an error when running a "select * " query?
+#### Why do I get an error when running a "select * " query?
- * The cube only has aggregated data, so all your queries should be
aggregated queries ("GROUP BY"). You can use a SQL with all dimensions be
grouped to get them as close as the detailed result, but that is not the raw
data.
+ * The cube has only the aggregated data, so all your queries should be
aggregated queries ("GROUP BY"). You can write a SQL query with all dimensions
grouped to get a result as close as possible to the detailed data, but that is
not the raw data.
* In order to be connected from some BI tools, Kylin tries to answer "select
\*" queries, but please be aware the result might not be as expected. Please
make sure each query to Kylin is aggregated.
@@ -76,9 +76,9 @@ But if you do want, there are some workarounds. 1) Add the
primary key as a dime
#### What is the UHC dimension?
- * UHC means Ultra High Cardinality. Cardinality means the number of distinct
values of a dimension. Usually, a dimension's cardinality is from tens to
millions. If above million, we call it a UHC dimension, for example, user id,
cell number, etc.
+ * UHC means Ultra High Cardinality. Cardinality means the number of distinct
values of a dimension. Usually, a dimension's cardinality is from tens to
millions. If it is above a million, we call it a UHC dimension, for example,
user id, cell number, etc.
- * Kylin supports UHC dimension but you need to pay attention to UHC
dimension, especially the encoding and the cuboid combinations. It may cause
your Cube very large and query to be slow.
+ * Kylin supports UHC dimensions, but you need to pay attention to them,
especially the encodings and the cuboid combinations. They may cause
your Cube to be very large and queries to be slow.
#### Can I specify a cube to answer my SQL statements?
@@ -103,11 +103,11 @@ But if you do want, there are some workarounds. 1) Add
the primary key as a dime
#### How to encrypt cube data?
- * You can enable encryption at HBase side. Refer
https://hbase.apache.org/book.html#hbase.encryption.server for more details.
+ * You can enable encryption on the HBase side. Refer to
https://hbase.apache.org/book.html#hbase.encryption.server for more details.
#### How to schedule the cube build at a fixed frequency, in an automatic way?
- * Kylin doesn't have a built-in scheduler for this. You can trigger that
through Rest API from external scheduler services, like Linux cron job, Apache
Airflow, etc.
+ * Kylin doesn't have a built-in scheduler for this. You can trigger builds
through the REST API from external scheduler services, like a Linux cron job,
Apache Airflow, etc.
#### How to export/import cube/project across different Kylin environments?
@@ -123,7 +123,7 @@ But if you do want, there are some workarounds. 1) Add the
primary key as a dime
#### The Cube is ready, but why the table does not appear in the "Insight" tab?
- * Make sure the "kylin.server.cluster-servers" property in
`conf/kylin.properties` is configured with EVERY Kylin node, all job and query
nodes. Kylin nodes notify each other to flush cache with this configuration.
And please ensure the network among them are healthy.
+ * Make sure the "kylin.server.cluster-servers" property in
`conf/kylin.properties` is configured with EVERY Kylin node, all job and query
nodes. Kylin nodes notify each other to flush caches with this configuration.
Also, please ensure that the network among them is healthy.
#### What should I do if I encounter a "java.lang.NoClassDefFoundError" error?
@@ -135,7 +135,7 @@ But if you do want, there are some workarounds. 1) Add the
primary key as a dime
#### How to add dimension/measure to a cube?
- * Once a cube is built, its structure couldn't be modified. To add
dimension/measure, you need to clone a new cube, and then add in it.
+ * Once a cube is built, its structure cannot be modified. To add a
dimension/measure, you need to clone a new cube and then add the
dimension/measure to it.
When the new cube is built, please disable or drop the old one.
@@ -183,23 +183,23 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
#### How to add a new JDBC data source dialect?
- * That is easy to add a new type of JDBC data source. You can follow such
steps:
+ * It is easy to add a new type of JDBC data source. You can follow these
steps:
1) Add the dialect in
source-hive/src/main/java/org/apache/kylin/source/jdbc/JdbcDialect.java
2) Implement a new IJdbcMetadata if {database that you want to add}'s metadata
fetching is different from others, and then register it in JdbcMetadataFactory
-3) You may need to customize the SQL for creating/dropping table in
JdbcExplorer for {database that you want to add}.
+3) You may need to customize the SQL for creating/dropping tables in
JdbcExplorer for {database that you want to add}.
#### How to ask a question?
- * Check Kylin documents first. and do a Google search also can help.
Sometimes the question has been answered so you don't need ask again. If no
matching, please send your question to Apache Kylin user mailing list:
[email protected]; You need to drop an email to
[email protected] to subscribe if you haven't done so. In the
email content, please provide your Kylin and Hadoop version, specific error
logs (as much as possible), and also the how to re-produce steps.
+ * Check the Kylin documents first, and doing a Google search can also help.
Sometimes the question has been answered, so you don't need to ask again. If
there is no match, please send your question to the Apache Kylin user mailing
list: [email protected]; you need to drop an email to
[email protected] to subscribe if you haven't done so. In the
email content, please provide your Kylin and Hadoop versions, specific error
logs (as much as possible), and also the steps to reproduce the issue.
#### "bin/find-hive-dependency.sh" can locate hive/hcat jars in local, but
Kylin reports error like "java.lang.NoClassDefFoundError:
org/apache/hive/hcatalog/mapreduce/HCatInputFormat" or
"java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/session/SessionState"
- * Kylin need many dependent jars (hadoop/hive/hcat/hbase/kafka) on classpath
to work, but Kylin doesn't ship them. It will seek these jars from your local
machine by running commands like `hbase classpath`, `hive -e set` etc. The
founded jars' path will be appended to the environment variable
*HBASE_CLASSPATH* (Kylin uses `hbase` shell command to start up, which will
read this). But in some Hadoop distribution (like AWS EMR 5.0), the `hbase`
shell doesn't keep the origin `HBASE_CLASSPA [...]
+ * Kylin needs many dependent jars (hadoop/hive/hcat/hbase/kafka) on the
classpath to work, but Kylin doesn't ship them. It will seek these jars from
your local machine by running commands like `hbase classpath`, `hive -e set`,
etc. The paths of the found jars will be appended to the environment variable
*HBASE_CLASSPATH* (Kylin uses the `hbase` shell command to start up, which will
read this). But in some Hadoop distributions (like AWS EMR 5.0), the `hbase`
shell doesn't keep the original `HBASE_CLASSP [...]
- * To fix this, find the hbase shell script (in hbase/bin folder), and search
*HBASE_CLASSPATH*, check whether it overwrite the value like :
+ * To fix this, find the hbase shell script (in the hbase/bin folder), search
for *HBASE_CLASSPATH*, and check whether it overwrites the value like:
{% highlight Groff markup %}
export
HBASE_CLASSPATH=$HADOOP_CONF:$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$ZOOKEEPER_HOME/*:$ZOOKEEPER_HOME/lib/*
@@ -213,7 +213,7 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
#### Get "java.lang.IllegalArgumentException: Too high cardinality is not
suitable for dictionary -- cardinality: 5220674" in "Build Dimension
Dictionary" step
- * Kylin uses "Dictionary" encoding to encode/decode the dimension values
(check [this blog](/blog/2015/08/13/kylin-dictionary/)); Usually a dimension's
cardinality is less than millions, so the "Dict" encoding is good to use. As
dictionary need be persisted and loaded into memory, if a dimension's
cardinality is very high, the memory footprint will be tremendous, so Kylin add
a check on this. If you see this error, suggest to identify the UHC dimension
first and then re-evaluate the de [...]
+ * Kylin uses "Dictionary" encoding to encode/decode the dimension values
(check [this blog](/blog/2015/08/13/kylin-dictionary/)); usually a dimension's
cardinality is less than a million, so the "Dict" encoding is good to use. As
the dictionary needs to be persisted and loaded into memory, if a dimension's
cardinality is very high, the memory footprint will be tremendous, so Kylin
adds a check on this. If you see this error, please identify the UHC dimension
first and then re-evaluate the des [...]
#### How to Install Kylin on CDH 5.2 or Hadoop 2.5.x
@@ -221,7 +221,7 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
* Check out discussion:
[https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ](https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ)
{% highlight Groff markup %}
- I was able to deploy Kylin with following option in POM.
+ I was able to deploy Kylin with the following options in POM.
<hadoop2.version>2.5.0</hadoop2.version>
<yarn.version>2.5.0</yarn.version>
<hbase-hadoop2.version>0.98.6-hadoop2</hbase-hadoop2.version>
@@ -232,13 +232,13 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
#### SUM(field) returns a negative result while all the numbers in this field
are > 0
- * If a column is declared as integer in Hive, the SQL engine (calcite) will
use column's type (integer) as the data type for "SUM(field)", while the
aggregated value on this field may exceed the scope of integer; in that case
the cast will cause a negtive value be returned; The workaround is, alter that
column's type to BIGINT in hive, and then sync the table schema to Kylin (the
cube doesn't need rebuild); Keep in mind that, always declare as BIGINT in hive
for an integer column which [...]
+ * If a column is declared as integer in Hive, the SQL engine (Calcite) will
use the column's type (integer) as the data type for "SUM(field)". When the
aggregated value on this field exceeds the range of integer, the cast will
cause a negative value to be returned. The workaround is to alter that column's
type to BIGINT in Hive and then sync the table schema to Kylin (the cube
doesn't need to be rebuilt). Keep in mind to always declare BIGINT in Hive for
an integer column which would be us [...]
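A minimal sketch of that workaround; the table and column names ("sales",
"amount") are hypothetical, and the actual `hive` invocation is commented out
so only the DDL is shown:

```shell
# Hedged sketch of the BIGINT workaround described above; "sales" and
# "amount" are placeholder names. After the ALTER, re-sync the table
# schema in Kylin (the cube does not need a rebuild).
DDL="ALTER TABLE sales CHANGE COLUMN amount amount BIGINT;"
echo "$DDL"
# hive -e "$DDL"
```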
-#### Why Kylin need extract the distinct columns from Fact Table before
building cube?
- * Kylin uses dictionary to encode the values in each column, this greatly
reduce the cube's storage size. To build the dictionary, Kylin need fetch the
distinct values for each column.
+#### Why does Kylin need to extract the distinct columns from the fact table
before building the cube?
+ * Kylin uses a dictionary to encode the values in each column; this
significantly reduces the cube's storage size. To build the dictionary, Kylin
needs to fetch the distinct values for each column.
-#### Why Kylin calculate the HIVE table cardinality?
- * The cardinality of dimensions is an important measure of cube complexity.
The higher the cardinality, the bigger the cube, and thus the longer to build
and the slower to query. Cardinality > 1,000 is worth attention and > 1,000,000
should be avoided at best effort. For optimal cube performance, try reduce high
cardinality by categorize values or derive features.
+#### Why does Kylin calculate the Hive table cardinality?
+ * The cardinality of dimensions is an important measure of cube complexity.
The higher the cardinality, the bigger the cube, the longer it takes to build,
and the slower it is to query. Cardinality > 1,000 is worth attention, and
> 1,000,000 should be avoided at best effort. For optimal cube performance,
try to reduce high cardinality by categorizing values or deriving features.
#### How to add new user or change the default password?
* Please check the document: [How to add new user or change the default
password](https://cwiki.apache.org/confluence/display/KYLIN/How+to+add+new+user+or+change+the+default+password).
@@ -281,18 +281,18 @@ group by a.slr_sgmt
[http://npm.taobao.org](http://npm.taobao.org)
#### Failed to run BuildCubeWithEngineTest, saying failed to connect to hbase
while hbase is active
- * User may get this error when first time run hbase client, please check the
error trace to see whether there is an error saying couldn't access a folder
like "/hadoop/hbase/local/jars"; If that folder doesn't exist, create it.
+ * Users may get this error when running the hbase client for the first time.
Please check the error trace to see whether there is an error saying a folder
like "/hadoop/hbase/local/jars" couldn't be accessed; if that folder doesn't
exist, create it.
#### Kylin JDBC driver returns a different Date/time than the REST API; it
seems to add the timezone when parsing the date.
* Please check the [post in mailing
list](http://apache-kylin.74782.x6.nabble.com/JDBC-query-result-Date-column-get-wrong-value-td5370.html)
-#### What kind of data be left in 'kylin.env.hdfs-working-dir' ? We often
execute kylin cleanup storage command, but now our working dir folder is about
300 GB size, can we delete old data manually?
+#### What kind of data is left in 'kylin.env.hdfs-working-dir'? We often
execute the kylin cleanup storage command, but now our working dir folder is
about 300 GB in size; can we delete old data manually?
* The data in 'hdfs-working-dir' ('hdfs:///kylin/kylin_metadata/' by
default) includes intermediate files (which will be GCed) and Cuboid data
(which won't be). The Cuboid data is kept for further segment merges, as Kylin
can't merge from HBase. If you're sure those segments won't be merged, you can
move them to other paths or even delete them.
* Please pay attention to the "resources" or "jdbc-resources" sub-folder
under '/kylin/kylin_metadata/', which persists big metadata files like
dictionaries and lookup tables' snapshots. They shouldn't be manually moved.
#### How to escape the key word in fuzzy match (like) queries?
-"%", "_" are key words in the "like" clause; "%" matches any character, and
"_" matches a single character; When you wants to match the key word like "_",
need to escape them with another character ahead; Below is a sample with "/" to
escape, the query is to match the "xiao_":
+"%", "_" are keywords in the "like" clause; "%" matches any character, and "_"
matches a single character; When you want to match the keyword like "_", you
need to escape them with another character ahead; Below is a sample with "/" to
escape, the query is to match the "xiao_":
"select username from gg_user where username like '%xiao/_%' escape '/'; "
\ No newline at end of file
diff --git a/website/_docs/install/Kylin_kubernetes.md
b/website/_docs/install/Kylin_kubernetes.md
index 0508963..e6b8e8a 100644
--- a/website/_docs/install/Kylin_kubernetes.md
+++ b/website/_docs/install/Kylin_kubernetes.md
@@ -6,9 +6,9 @@ permalink: /docs/install/kylin_on_kubernetes.html
since: v3.0.2
---
-Kubernetes is a portable, extensible, open-source platform for managing
containerized workloads and services, that facilitates both declarative
configuration and automation. It has a large, rapidly growing ecosystem.
Kubernetes services, support, and tools are widely available.
+Kubernetes is a portable, extensible, open-source platform for managing
containerized workloads and services; it facilitates both declarative
configuration and automation. It has a large, rapidly growing ecosystem.
Kubernetes services, support and tools are widely available.
-Apache Kylin is a open source, distributed analytical data warehouse for big
data. Deploy Kylin on Kubernetes cluster, will reduce cost of maintenance and
extension.
+Apache Kylin is an open source, distributed analytical data warehouse for big
data. Deploying Kylin on a Kubernetes cluster will reduce the cost of
maintenance and extension.
## Directory
Visit and download https://github.com/apache/kylin/tree/master/kubernetes and
you will find three directories:
@@ -16,11 +16,11 @@ Visit and download
https://github.com/apache/kylin/tree/master/kubernetes and yo
- **config**
Please update your configuration file here.
- **template**
- This directory provided two deployment templates, one for quick-start
purpose, another for production/distributed deployment.
- - Quick-start template is for one node deployment with an ALL kylin
instance.
- - Production template is for multi-nodes deployment with a few of
job/query kylin instances; and some other service like memcached and
filebeat(check doc at [ELK stack](https://www.elastic.co/what-is/elk-stack))
will help to satisfy log collection/query cache/session sharing demand.
This directory provides two deployment templates, one for quick-start
purposes, the other for production/distributed deployment.
+ - The quick-start template is for a one-node deployment with an ALL kylin
instance.
+ - The production template is for a multi-node deployment with a few
job/query kylin instances. Moreover, some other services like memcached and
filebeat (check the doc at [ELK stack](https://www.elastic.co/what-is/elk-stack))
will help satisfy log collection, query cache, and session sharing demands.
- **docker**
- Docker image is the pre-requirement of Kylin on Kubernetes, please check this
directory if you need build it yourself. For CDH5.x user, you may consider use
a provided image on DockerHub.
A Docker image is the prerequisite of Kylin on Kubernetes; please check this
directory if you need to build it yourself. CDH5.x users may consider
using a provided image on DockerHub.
---
@@ -29,45 +29,45 @@ Visit and download
https://github.com/apache/kylin/tree/master/kubernetes and yo
1. A hadoop cluster.
2. A K8s cluster, with sufficient system resources.
3. **kylin-client** image.
-4. A Elasticsearch cluster(maybe optional).
+4. An Elasticsearch cluster (optional; needed for log collection).
## How to build docker image
### Hadoop-client image
-What is hadoop-client docker image and why we need this?
+What is a hadoop-client docker image and why do we need it?
-As we all know, the node you want to deploy Kylin, should contains Hadoop
dependency(jars and configuration files), these dependency let you have access
to Hadoop Service, such as HDFS, HBase, Hive, which are needed by Apache Kylin.
Unfortunately, each Hadoop distribution(CHD or HDP etc.) has its own specific
jars. So, we can build specific image for specific Hadoop distribution, which
will make image management task more easier. This will have following two
benefits:
+As we all know, the node where you want to deploy Kylin should contain the
Hadoop dependencies (jars and configuration files); these dependencies give you
access to Hadoop services such as HDFS, HBase, and Hive, which are needed by
Apache Kylin. Unfortunately, each Hadoop distribution (CDH or HDP, etc.) has
its own specific jars. So we can build a specific image for each Hadoop
distribution, which will make the image management task easier. This will have
the following two benefits:
-- Someone who has better knowledge on Hadoop can do this work, and let kylin
user build their Kylin image base on provided Hadoop-Client image.
+- Someone who has more knowledge of Hadoop can do this work, and kylin
users can build their Kylin image based on the provided Hadoop-Client image.
- Upgrade Kylin will be much easier.
Build Step
-- Prepare and modify Dockerfile(If you are using other hadoop distribution,
please consider build image yourself).
+- Prepare and modify the Dockerfile (if you are using another hadoop
distribution, please consider building an image yourself).
- Place Spark binary(such as `spark-2.3.2-bin-hadoop2.7.tgz`) into dir
`provided-binary`.
-- Run `build-image.sh` to build image.
+- Run `build-image.sh` to build the image.
### Kylin-client image
-What is kylin-client docker images?
+What is a kylin-client docker image?
-**kylin-client** is a docker image which based on **hadoop-client**, it will
provided the flexibility of upgrade of Apache Kylin.
+**kylin-client** is a docker image based on **hadoop-client**; it provides
the flexibility to upgrade Apache Kylin.
Build Step
- Place Kylin binary(such as `apache-kylin-3.0.1-bin-cdh57.tar.gz`) and
uncompress it into current dir.
-- Modify `Dockerfile` , change the value of `KYLIN_VERSION` and name of base
image(hadoop-client).
+- Modify the `Dockerfile`, change the value of `KYLIN_VERSION` and the name
of the base image (hadoop-client).
- Run `build-image.sh` to build image.
----
## How to deploy kylin on kubernetes
-Here let's take a look of how to deploy a kylin cluster which connect to CDH
5.7.
+Here let's take a look at how to deploy a kylin cluster which connects to CDH
5.7.
1 `kubenetes/template/production/example/deployment` is the working directory.
-2 Update hadoop configuration files
(`kubenetes/template/production/example/config/hadoop`) and filebeat 's
configuration file.
+2 Update hadoop configuration files
(`kubenetes/template/production/example/config/hadoop`) and filebeat's
configuration file.
3 Create statefulset and service for memcached.
@@ -79,7 +79,7 @@ service/cache-svc created
statefulset.apps/kylin-memcached created
```
-- Check hostname of cache service.
+- Check the hostname of cache service.
```
$ kubectl run -it --image=busybox:1.28.4 --rm --restart=Never sh -n test-dns
@@ -105,7 +105,7 @@ $ vim ../config/kylin-job/kylin.properties
$ vim ../config/kylin-query/kylin.properties
```
-- Create configMap
+- Create the configMap
```
$ kubectl create configmap -n kylin-example hadoop-config \
$ kubectl exec -it kylin-job-0 -n kylin-example -- bash
$ kubectl get pod kylin-job-0 -n kylin-example -o yaml
```
-- If you don't have a Elasticsearch cluster or not interested in log
collection, please remove filebeat container in both kylin-query-stateful.yaml
and kylin-job-stateful.yaml.
+- If you don't have an Elasticsearch cluster or are not interested in log
collection, please remove the filebeat container from both
kylin-query-stateful.yaml and kylin-job-stateful.yaml.
-- If you want to check detail or want to have a discussion, please read or
comment on [KYLIN-4447 Kylin on kubernetes in production
env](https://issues.apache.org/jira/browse/KYLIN-4447) .
+- If you want to check the details or have a discussion, please read or
comment on [KYLIN-4447 Kylin on kubernetes in production
env](https://issues.apache.org/jira/browse/KYLIN-4447).
-- Find provided docker image at: DockerHub: :
[apachekylin/kylin-client](https://hub.docker.com/r/apachekylin/kylin-client)
\ No newline at end of file
+- Find the provided docker image on DockerHub:
[apachekylin/kylin-client](https://hub.docker.com/r/apachekylin/kylin-client)
\ No newline at end of file
diff --git a/website/_docs/install/kylin_cluster.cn.md
b/website/_docs/install/kylin_cluster.cn.md
index e33a411..7e8eb8e 100644
--- a/website/_docs/install/kylin_cluster.cn.md
+++ b/website/_docs/install/kylin_cluster.cn.md
@@ -37,9 +37,9 @@
kylin.job.lock=org.apache.kylin.storage.hbase.util.ZookeeperJobLock
```
然后将所有任务和查询节点的地址注册到 `kylin.server.cluster-servers`。
-### 配置`CuratorScheculer`进行任务调度
+### 配置`CuratorScheduler`进行任务调度
-从 v3.0.0-alpha 开始,kylin引入基于Curator的主从模式多任务引擎调度器,用户可以修改如下配置来启用CuratorScheculer:
+从 v3.0.0-alpha 开始,kylin引入基于Curator的主从模式多任务引擎调度器,用户可以修改如下配置来启用CuratorScheduler:
```properties
kylin.job.scheduler.default=100
diff --git a/website/_docs/install/kylin_cluster.md
b/website/_docs/install/kylin_cluster.md
index 7543e5c..3da1b94 100644
--- a/website/_docs/install/kylin_cluster.md
+++ b/website/_docs/install/kylin_cluster.md
@@ -37,7 +37,7 @@
kylin.job.lock=org.apache.kylin.storage.hbase.util.ZookeeperJobLock
Then please add all job servers and query servers to the
`kylin.server.cluster-servers`.
-### Use `CuratorScheculer`
+### Use `CuratorScheduler`
Since v3.0.0-alpha, kylin has introduced a Leader/Follower mode multiple-job-engine
scheduler based on Curator. Users can modify the following
configuration to enable CuratorScheduler:
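As a sketch, enabling the scheduler amounts to a couple of lines in
`conf/kylin.properties`. The `kylin.job.scheduler.default=100` value comes
from the diff above; `kylin.server.self-discovery-enabled` is an assumption
from Kylin's clustering docs (verify against your version), and the target
path here is a placeholder:

```shell
# Hedged sketch: append CuratorScheduler settings to a (placeholder) copy
# of kylin.properties. kylin.server.self-discovery-enabled is an assumed
# companion key; check it against your Kylin version's docs.
CONF=/tmp/kylin.properties
cat >> "$CONF" <<'EOF'
kylin.job.scheduler.default=100
kylin.server.self-discovery-enabled=true
EOF
cat "$CONF"
```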
@@ -50,7 +50,7 @@ For more details about the kylin job scheduler, please refer
to [Apache Kylin Wi
### Installing a load balancer
-To send query requests to a cluster instead of a single node, you can deploy a
load balancer such as [Nginx](http://nginx.org/en/), [F5](https://www.f5.com/)
or [cloudlb](https://rubygems.org/gems/cloudlb/), etc., so that the client and
load balancer communication instead communicate with a specific Kylin instance.
+To send query requests to a cluster instead of a single node, you can deploy a
load balancer such as [Nginx](http://nginx.org/en/), [F5](https://www.f5.com/)
or [cloudlb](https://rubygems.org/gems/cloudlb/), etc., so that the client
communicates with the load balancer instead of a specific Kylin instance.
@@ -58,7 +58,7 @@ To send query requests to a cluster instead of a single node,
you can deploy a l
For better stability and optimal performance, it is recommended to perform a
read-write separation deployment, deploying Kylin on two clusters as follows:
-* A Hadoop cluster used to *Cube build*, which can be a large cluster shared
with other applications;
-* An HBase cluster used to *SQL query*. Usually this cluster is configured for
Kylin. The number of nodes does not need to be as many as Hadoop clusters.
HBase configuration can be optimized for Kylin Cube read-only features.
+* A Hadoop cluster used for *Cube build*, which can be a large cluster shared
with other applications;
+* An HBase cluster used for *SQL query*. Usually this cluster is configured
for Kylin. The number of nodes does not need to be as many as in the Hadoop
cluster. The HBase configuration can be optimized for Kylin Cube read-only
features.
This deployment strategy is the best deployment solution for the production
environment. For how to perform read-write separation deployment, please refer
to [Deploy Apache Kylin with Standalone HBase
Cluster](/blog/2016/06/10/standalone-hbase-cluster/) .
\ No newline at end of file