This is an automated email from the ASF dual-hosted git repository.
xxyu pushed a commit to branch document
in repository https://gitbox.apache.org/repos/asf/kylin.git
The following commit(s) were added to refs/heads/document by this push:
new f35dfb7 Fix typos (#1553)
f35dfb7 is described below
commit f35dfb7178ef1781af420063ffea89c8eb254bda
Author: Helen Zeng <[email protected]>
AuthorDate: Fri Jan 22 20:41:30 2021 +0800
Fix typos (#1553)
* fix typos
* grammar fix
---
website/_docs/gettingstarted/faq.md | 58 +++++++++++++++----------------
website/_docs/install/Kylin_kubernetes.md | 44 +++++++++++------------
website/_docs/install/kylin_cluster.cn.md | 4 +--
website/_docs/install/kylin_cluster.md | 8 ++---
4 files changed, 57 insertions(+), 57 deletions(-)
diff --git a/website/_docs/gettingstarted/faq.md
b/website/_docs/gettingstarted/faq.md
index 9305045..5c5a403 100644
--- a/website/_docs/gettingstarted/faq.md
+++ b/website/_docs/gettingstarted/faq.md
@@ -14,15 +14,15 @@ There is an article about [how to ask a question in a smart
way](http://catb.org
#### Is Kylin a generic SQL engine for big data?
- * No, Kylin is an OLAP engine with SQL interface. The SQL queries need be
matched with the pre-defined OLAP model.
+ * No, Kylin is an OLAP engine with a SQL interface. The SQL queries should be
matched with the pre-defined OLAP model.
#### What's a typical scenario to use Apache Kylin?
- * Kylin can be the best option if you have a huge table (e.g., >100 million
rows), join with lookup tables, while queries need be finished in the second
level (dashboards, interactive reports, business intelligence, etc), and the
concurrent users can be dozens or hundreds.
+ * Kylin can be the best option if you have a huge table (e.g., >100 million
rows) joined with lookup tables, queries need to be finished at the second
level (dashboards, interactive reports, business intelligence, etc.), and the
concurrent users can be dozens or hundreds.
#### How large a data scale can Kylin support? How about the performance?
- * Kylin can supports second level query performance at TB to PB level
dataset. This has been verified by users like eBay, Meituan, Toutiao. Take
Meituan's case as an example (till 2018-08), 973 cubes, 3.8 million queries per
day, raw data 8.9 trillion, total cube size 971 TB (original data is bigger),
50% queries finished in < 0.5 seconds, 90% queries < 1.2 seconds.
+ * Kylin can support second-level query performance on TB to PB level
datasets. This has been verified by users like eBay, Meituan, and Toutiao. Take
Meituan's case as an example (as of 2018-08): 973 cubes, 3.8 million queries per
day, raw data 8.9 trillion, total cube size 971 TB (original data is bigger),
50% of the queries finished in < 0.5 seconds, 90% in < 1.2 seconds.
#### Who are using Apache Kylin?
@@ -34,11 +34,11 @@ There is an article about [how to ask a question in a smart
way](http://catb.org
#### How to compare Kylin with other SQL engines like Hive, Presto, Spark SQL,
Impala?
- * They answer a query in different ways. Kylin is not a replacement for
them, but a supplement (query accelerator). Many users run Kylin together with
other SQL engines. For the high frequent query patterns, building Cubes can
greatly improve the performance and also offload cluster workloads. For less
queried patterns or ad-hoc queries, ther MPP engines are more flexible.
+ * They answer a query in different ways. Kylin is not a replacement for
them, but a supplement (query accelerator). Many users run Kylin together with
other SQL engines. For high-frequency query patterns, building Cubes can
greatly improve the performance and also offload cluster workloads. For less
queried patterns or ad-hoc queries, other MPP engines are more flexible.
#### How to compare Kylin with Druid?
- * Druid is more suitable for real-time analysis. Kylin is more focus on OLAP
case. Druid has good integration with Kafka as real-time streaming; Kylin
fetches data from Hive or Kafka in batches. The real-time capability of Kylin
is still under development.
+ * Druid is more suitable for real-time analysis. Kylin focuses more on the
OLAP case. Druid has good integration with Kafka for real-time streaming; Kylin
fetches data from Hive or Kafka in batches. The real-time capability of Kylin
is still under development.
* Many internet service providers host both Druid and Kylin, serving
different purposes (real-time and historical).
@@ -58,13 +58,13 @@ There is an article about [how to ask a question in a smart
way](http://catb.org
#### How many dimensions can be in a cube?
- * The max physical dimension number (exclude derived column in lookup
tables) in a cube is 63; If you can normalize some dimensions to lookup tables,
with derived dimensions, you can create a cube with more than 100 dimensions.
+ * The max physical dimension number (excluding derived columns in lookup
tables) in a cube is 63; if you can normalize some dimensions to lookup tables,
with derived dimensions, you can create a cube with more than 100 dimensions.
* But a cube with > 30 physical dimensions is not recommended; you couldn't
even save that in Kylin if you don't optimize the aggregation groups.
Please search "curse of dimensionality".
-#### Why I got an error when running a "select * " query?
+#### Why do I get an error when running a "select * " query?
- * The cube only has aggregated data, so all your queries should be
aggregated queries ("GROUP BY"). You can use a SQL with all dimensions be
grouped to get them as close as the detailed result, but that is not the raw
data.
+ * The cube has only the aggregated data, so all your queries should be
aggregated queries ("GROUP BY"). You can write a SQL query with all dimensions
grouped to get a result as close as possible to the detailed data, but that is
not the raw data.
* In order to be connected from some BI tools, Kylin tries to answer "select
\*" queries, but please be aware the result might not be as expected. Please
make sure each query to Kylin is aggregated.
@@ -76,9 +76,9 @@ But if you do want, there are some workarounds. 1) Add the
primary key as a dime
#### What is the UHC dimension?
- * UHC means Ultra High Cardinality. Cardinality means the number of distinct
values of a dimension. Usually, a dimension's cardinality is from tens to
millions. If above million, we call it a UHC dimension, for example, user id,
cell number, etc.
+ * UHC means Ultra High Cardinality. Cardinality means the number of distinct
values of a dimension. Usually, a dimension's cardinality is from tens to
millions. If it is above a million, we call it a UHC dimension, for example,
user id, cell number, etc.
- * Kylin supports UHC dimension but you need to pay attention to UHC
dimension, especially the encoding and the cuboid combinations. It may cause
your Cube very large and query to be slow.
+ * Kylin supports UHC dimensions, but you need to pay attention to them,
especially the encodings and the cuboid combinations. They may cause
your Cube to be very large and queries to be slow.
#### Can I specify a cube to answer my SQL statements?
@@ -103,11 +103,11 @@ But if you do want, there are some workarounds. 1) Add
the primary key as a dime
#### How to encrypt cube data?
- * You can enable encryption at HBase side. Refer
https://hbase.apache.org/book.html#hbase.encryption.server for more details.
+ * You can enable encryption on the HBase side. Refer to
https://hbase.apache.org/book.html#hbase.encryption.server for more details.
#### How to schedule the cube build at a fixed frequency, in an automatic way?
- * Kylin doesn't have a built-in scheduler for this. You can trigger that
through Rest API from external scheduler services, like Linux cron job, Apache
Airflow, etc.
+ * Kylin doesn't have a built-in scheduler for this. You can trigger builds
through the REST API from external scheduler services, like a Linux cron job,
Apache Airflow, etc.
#### How to export/import cube/project across different Kylin environments?
@@ -123,7 +123,7 @@ But if you do want, there are some workarounds. 1) Add the
primary key as a dime
#### The Cube is ready, but why the table does not appear in the "Insight" tab?
- * Make sure the "kylin.server.cluster-servers" property in
`conf/kylin.properties` is configured with EVERY Kylin node, all job and query
nodes. Kylin nodes notify each other to flush cache with this configuration.
And please ensure the network among them are healthy.
+ * Make sure the "kylin.server.cluster-servers" property in
`conf/kylin.properties` is configured with EVERY Kylin node, all job and query
nodes. Kylin nodes notify each other to flush caches with this configuration.
Also, please ensure that the network among them is healthy.
#### What should I do if I encounter a "java.lang.NoClassDefFoundError" error?
@@ -135,7 +135,7 @@ But if you do want, there are some workarounds. 1) Add the
primary key as a dime
#### How to add dimension/measure to a cube?
- * Once a cube is built, its structure couldn't be modified. To add
dimension/measure, you need to clone a new cube, and then add in it.
+ * Once a cube is built, its structure cannot be modified. To add a
dimension/measure, you need to clone a new cube and then add the
dimension/measure to it.
When the new cube is built, please disable or drop the old one.
@@ -183,23 +183,23 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
#### How to add a new JDBC data source dialect?
- * That is easy to add a new type of JDBC data source. You can follow such
steps:
+ * It is easy to add a new type of JDBC data source. You can follow these
steps:
1) Add the dialect in
source-hive/src/main/java/org/apache/kylin/source/jdbc/JdbcDialect.java
2) Implement a new IJdbcMetadata if {database that you want to add}'s metadata
fetching is different from others, and then register it in JdbcMetadataFactory
-3) You may need to customize the SQL for creating/dropping table in
JdbcExplorer for {database that you want to add}.
+3) You may need to customize the SQL for creating/dropping tables in
JdbcExplorer for {database that you want to add}.
#### How to ask a question?
- * Check Kylin documents first. and do a Google search also can help.
Sometimes the question has been answered so you don't need ask again. If no
matching, please send your question to Apache Kylin user mailing list:
[email protected]; You need to drop an email to
[email protected] to subscribe if you haven't done so. In the
email content, please provide your Kylin and Hadoop version, specific error
logs (as much as possible), and also the how to re-produce steps.
+ * Check the Kylin documents first, and doing a Google search can also help.
Sometimes the question has been answered, so you don't need to ask again. If
there is no match, please send your question to the Apache Kylin user mailing
list: [email protected]; you need to drop an email to
[email protected] to subscribe if you haven't done so. In the
email content, please provide your Kylin and Hadoop versions, specific error
logs (as much as possible), and also the steps to reproduce the issue.
#### "bin/find-hive-dependency.sh" can locate hive/hcat jars in local, but
Kylin reports error like "java.lang.NoClassDefFoundError:
org/apache/hive/hcatalog/mapreduce/HCatInputFormat" or
"java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/session/SessionState"
- * Kylin need many dependent jars (hadoop/hive/hcat/hbase/kafka) on classpath
to work, but Kylin doesn't ship them. It will seek these jars from your local
machine by running commands like `hbase classpath`, `hive -e set` etc. The
founded jars' path will be appended to the environment variable
*HBASE_CLASSPATH* (Kylin uses `hbase` shell command to start up, which will
read this). But in some Hadoop distribution (like AWS EMR 5.0), the `hbase`
shell doesn't keep the origin `HBASE_CLASSPA [...]
+ * Kylin needs many dependent jars (hadoop/hive/hcat/hbase/kafka) on the
classpath to work, but Kylin doesn't ship them. It will seek these jars from
your local machine by running commands like `hbase classpath`, `hive -e set`,
etc. The paths of the found jars will be appended to the environment variable
*HBASE_CLASSPATH* (Kylin uses the `hbase` shell command to start up, which will
read this). But in some Hadoop distributions (like AWS EMR 5.0), the `hbase`
shell doesn't keep the original `HBASE_CLASSP [...]
- * To fix this, find the hbase shell script (in hbase/bin folder), and search
*HBASE_CLASSPATH*, check whether it overwrite the value like :
+ * To fix this, find the hbase shell script (in the hbase/bin folder), search
for *HBASE_CLASSPATH*, and check whether it overwrites the value like:
{% highlight Groff markup %}
export
HBASE_CLASSPATH=$HADOOP_CONF:$HADOOP_HOME/*:$HADOOP_HOME/lib/*:$ZOOKEEPER_HOME/*:$ZOOKEEPER_HOME/lib/*
@@ -213,7 +213,7 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
#### Get "java.lang.IllegalArgumentException: Too high cardinality is not
suitable for dictionary -- cardinality: 5220674" in "Build Dimension
Dictionary" step
- * Kylin uses "Dictionary" encoding to encode/decode the dimension values
(check [this blog](/blog/2015/08/13/kylin-dictionary/)); Usually a dimension's
cardinality is less than millions, so the "Dict" encoding is good to use. As
dictionary need be persisted and loaded into memory, if a dimension's
cardinality is very high, the memory footprint will be tremendous, so Kylin add
a check on this. If you see this error, suggest to identify the UHC dimension
first and then re-evaluate the de [...]
+ * Kylin uses "Dictionary" encoding to encode/decode the dimension values
(check [this blog](/blog/2015/08/13/kylin-dictionary/)); usually a dimension's
cardinality is less than a million, so the "Dict" encoding is good to use. As
the dictionary needs to be persisted and loaded into memory, if a dimension's
cardinality is very high, the memory footprint will be tremendous, so Kylin
adds a check on this. If you see this error, please identify the UHC dimension
first and then re-evaluate the des [...]
#### How to Install Kylin on CDH 5.2 or Hadoop 2.5.x
@@ -221,7 +221,7 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
* Check out discussion:
[https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ](https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ)
{% highlight Groff markup %}
- I was able to deploy Kylin with following option in POM.
+ I was able to deploy Kylin with the following options in POM.
<hadoop2.version>2.5.0</hadoop2.version>
<yarn.version>2.5.0</yarn.version>
<hbase-hadoop2.version>0.98.6-hadoop2</hbase-hadoop2.version>
@@ -232,13 +232,13 @@ kylin.engine.spark-conf.spark.yarn.queue=YOUR_QUEUE_NAME
#### SUM(field) returns a negative result while all the numbers in this field
are > 0
- * If a column is declared as integer in Hive, the SQL engine (calcite) will
use column's type (integer) as the data type for "SUM(field)", while the
aggregated value on this field may exceed the scope of integer; in that case
the cast will cause a negtive value be returned; The workaround is, alter that
column's type to BIGINT in hive, and then sync the table schema to Kylin (the
cube doesn't need rebuild); Keep in mind that, always declare as BIGINT in hive
for an integer column which [...]
+ * If a column is declared as integer in Hive, the SQL engine (Calcite) will
use the column's type (integer) as the data type for "SUM(field)". When the
aggregated value on this field exceeds the range of integer, the cast will
cause a negative value to be returned. The workaround is to alter that column's
type to BIGINT in Hive and then sync the table schema to Kylin (the cube
doesn't need to be rebuilt). Keep in mind to always declare BIGINT in Hive for
an integer column which would be us [...]
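A minimal sketch of that workaround; the table and column names ("sales",
"amount") are hypothetical, and the actual `hive` invocation is commented out
so only the DDL is shown:

```shell
# Hedged sketch of the BIGINT workaround described above; "sales" and
# "amount" are placeholder names. After the ALTER, re-sync the table
# schema in Kylin (the cube does not need a rebuild).
DDL="ALTER TABLE sales CHANGE COLUMN amount amount BIGINT;"
echo "$DDL"
# hive -e "$DDL"
```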
-#### Why Kylin need extract the distinct columns from Fact Table before
building cube?
- * Kylin uses dictionary to encode the values in each column, this greatly
reduce the cube's storage size. To build the dictionary, Kylin need fetch the
distinct values for each column.
+#### Why does Kylin need to extract the distinct columns from the fact table
before building the cube?
+ * Kylin uses a dictionary to encode the values in each column; this
significantly reduces the cube's storage size. To build the dictionary, Kylin
needs to fetch the distinct values for each column.
-#### Why Kylin calculate the HIVE table cardinality?
- * The cardinality of dimensions is an important measure of cube complexity.
The higher the cardinality, the bigger the cube, and thus the longer to build
and the slower to query. Cardinality > 1,000 is worth attention and > 1,000,000
should be avoided at best effort. For optimal cube performance, try reduce high
cardinality by categorize values or derive features.
+#### Why does Kylin calculate the Hive table cardinality?
+ * The cardinality of dimensions is an important measure of cube complexity.
The higher the cardinality, the bigger the cube, the longer it takes to build,
and the slower it is to query. Cardinality > 1,000 is worth attention, and
> 1,000,000 should be avoided at best effort. For optimal cube performance,
try to reduce high cardinality by categorizing values or deriving features.
#### How to add new user or change the default password?
* Please check the document: [How to add new user or change the default
password](https://cwiki.apache.org/confluence/display/KYLIN/How+to+add+new+user+or+change+the+default+password).
@@ -281,18 +281,18 @@ group by a.slr_sgmt
[http://npm.taobao.org](http://npm.taobao.org)
#### Failed to run BuildCubeWithEngineTest, saying failed to connect to hbase
while hbase is active
- * User may get this error when first time run hbase client, please check the
error trace to see whether there is an error saying couldn't access a folder
like "/hadoop/hbase/local/jars"; If that folder doesn't exist, create it.
+ * Users may get this error when running the hbase client for the first time.
Please check the error trace to see whether there is an error saying a folder
like "/hadoop/hbase/local/jars" couldn't be accessed; if that folder doesn't
exist, create it.
#### Kylin JDBC driver returns a different Date/time than the REST API; it
seems to add the timezone when parsing the date.
* Please check the [post in mailing
list](http://apache-kylin.74782.x6.nabble.com/JDBC-query-result-Date-column-get-wrong-value-td5370.html)
-#### What kind of data be left in 'kylin.env.hdfs-working-dir' ? We often
execute kylin cleanup storage command, but now our working dir folder is about
300 GB size, can we delete old data manually?
+#### What kind of data is left in 'kylin.env.hdfs-working-dir'? We often
execute the kylin cleanup storage command, but now our working dir folder is
about 300 GB in size; can we delete old data manually?
* The data in 'hdfs-working-dir' ('hdfs:///kylin/kylin_metadata/' by
default) includes intermediate files (which will be GCed) and Cuboid data
(which won't be). The Cuboid data is kept for further segment merges, as Kylin
can't merge from HBase. If you're sure those segments won't be merged, you can
move them to other paths or even delete them.
* Please pay attention to the "resources" or "jdbc-resources" sub-folder
under '/kylin/kylin_metadata/', which persists big metadata files like
dictionaries and lookup tables' snapshots. They shouldn't be manually moved.
#### How to escape the key word in fuzzy match (like) queries?
-"%", "_" are key words in the "like" clause; "%" matches any character, and
"_" matches a single character; When you wants to match the key word like "_",
need to escape them with another character ahead; Below is a sample with "/" to
escape, the query is to match the "xiao_":
+"%", "_" are keywords in the "like" clause; "%" matches any character, and "_"
matches a single character; When you want to match the keyword like "_", you
need to escape them with another character ahead; Below is a sample with "/" to
escape, the query is to match the "xiao_":
"select username from gg_user where username like '%xiao/_%' escape '/'; "
\ No newline at end of file
diff --git a/website/_docs/install/Kylin_kubernetes.md
b/website/_docs/install/Kylin_kubernetes.md
index 0508963..e6b8e8a 100644
--- a/website/_docs/install/Kylin_kubernetes.md
+++ b/website/_docs/install/Kylin_kubernetes.md
@@ -6,9 +6,9 @@ permalink: /docs/install/kylin_on_kubernetes.html
since: v3.0.2
---
-Kubernetes is a portable, extensible, open-source platform for managing
containerized workloads and services, that facilitates both declarative
configuration and automation. It has a large, rapidly growing ecosystem.
Kubernetes services, support, and tools are widely available.
+Kubernetes is a portable, extensible, open-source platform for managing
containerized workloads and services; it facilitates both declarative
configuration and automation. It has a large, rapidly growing ecosystem.
Kubernetes services, support and tools are widely available.
-Apache Kylin is a open source, distributed analytical data warehouse for big
data. Deploy Kylin on Kubernetes cluster, will reduce cost of maintenance and
extension.
+Apache Kylin is an open source, distributed analytical data warehouse for big
data. Deploying Kylin on a Kubernetes cluster will reduce the cost of
maintenance and extension.
## Directory
Visit and download https://github.com/apache/kylin/tree/master/kubernetes and
you will find three directories:
@@ -16,11 +16,11 @@ Visit and download
https://github.com/apache/kylin/tree/master/kubernetes and yo
- **config**
Please update your configuration file here.
- **template**
- This directory provided two deployment templates, one for quick-start
purpose, another for production/distributed deployment.
- - Quick-start template is for one node deployment with an ALL kylin
instance.
- - Production template is for multi-nodes deployment with a few of
job/query kylin instances; and some other service like memcached and
filebeat(check doc at [ELK stack](https://www.elastic.co/what-is/elk-stack))
will help to satisfy log collection/query cache/session sharing demand.
This directory provides two deployment templates, one for quick-start
purposes, the other for production/distributed deployment.
+ - The quick-start template is for a one-node deployment with an ALL kylin
instance.
+ - The production template is for a multi-node deployment with a few
job/query kylin instances. Moreover, some other services like memcached and
filebeat (check the doc at [ELK stack](https://www.elastic.co/what-is/elk-stack))
will help satisfy log collection, query cache, and session sharing demands.
- **docker**
- Docker image is the pre-requirement of Kylin on Kubernetes, please check this
directory if you need build it yourself. For CDH5.x user, you may consider use
a provided image on DockerHub.
A Docker image is the prerequisite of Kylin on Kubernetes; please check this
directory if you need to build it yourself. CDH5.x users may consider
using a provided image on DockerHub.
---
@@ -29,45 +29,45 @@ Visit and download
https://github.com/apache/kylin/tree/master/kubernetes and yo
1. A hadoop cluster.
2. A K8s cluster, with sufficient system resources.
3. **kylin-client** image.
-4. A Elasticsearch cluster(maybe optional).
+4. An Elasticsearch cluster (optional; needed for log collection).
## How to build docker image
### Hadoop-client image
-What is hadoop-client docker image and why we need this?
+What is a hadoop-client docker image and why do we need it?
-As we all know, the node you want to deploy Kylin, should contains Hadoop
dependency(jars and configuration files), these dependency let you have access
to Hadoop Service, such as HDFS, HBase, Hive, which are needed by Apache Kylin.
Unfortunately, each Hadoop distribution(CHD or HDP etc.) has its own specific
jars. So, we can build specific image for specific Hadoop distribution, which
will make image management task more easier. This will have following two
benefits:
+As we all know, the node where you want to deploy Kylin should contain the
Hadoop dependencies (jars and configuration files); these dependencies give you
access to Hadoop services such as HDFS, HBase, and Hive, which are needed by
Apache Kylin. Unfortunately, each Hadoop distribution (CDH or HDP, etc.) has
its own specific jars. So we can build a specific image for each Hadoop
distribution, which will make the image management task easier. This will have
the following two benefits:
-- Someone who has better knowledge on Hadoop can do this work, and let kylin
user build their Kylin image base on provided Hadoop-Client image.
+- Someone who has more knowledge of Hadoop can do this work, and kylin
users can build their Kylin image based on the provided Hadoop-Client image.
- Upgrade Kylin will be much easier.
Build Step
-- Prepare and modify Dockerfile(If you are using other hadoop distribution,
please consider build image yourself).
+- Prepare and modify the Dockerfile (if you are using another hadoop
distribution, please consider building an image yourself).
- Place Spark binary(such as `spark-2.3.2-bin-hadoop2.7.tgz`) into dir
`provided-binary`.
-- Run `build-image.sh` to build image.
+- Run `build-image.sh` to build the image.
### Kylin-client image
-What is kylin-client docker images?
+What is a kylin-client docker image?
-**kylin-client** is a docker image which based on **hadoop-client**, it will
provided the flexibility of upgrade of Apache Kylin.
+**kylin-client** is a docker image based on **hadoop-client**; it provides
the flexibility to upgrade Apache Kylin.
Build Step
- Place Kylin binary(such as `apache-kylin-3.0.1-bin-cdh57.tar.gz`) and
uncompress it into current dir.
-- Modify `Dockerfile` , change the value of `KYLIN_VERSION` and name of base
image(hadoop-client).
+- Modify the `Dockerfile`, change the value of `KYLIN_VERSION` and the name
of the base image (hadoop-client).
- Run `build-image.sh` to build image.
----
## How to deploy kylin on kubernetes
-Here let's take a look of how to deploy a kylin cluster which connect to CDH
5.7.
+Here let's take a look at how to deploy a kylin cluster which connects to CDH
5.7.
1 `kubenetes/template/production/example/deployment` is the working directory.
-2 Update hadoop configuration files
(`kubenetes/template/production/example/config/hadoop`) and filebeat 's
configuration file.
+2 Update hadoop configuration files
(`kubenetes/template/production/example/config/hadoop`) and filebeat's
configuration file.
3 Create statefulset and service for memcached.
@@ -79,7 +79,7 @@ service/cache-svc created
statefulset.apps/kylin-memcached created
```
-- Check hostname of cache service.
+- Check the hostname of cache service.
```
$ kubectl run -it --image=busybox:1.28.4 --rm --restart=Never sh -n test-dns
@@ -105,7 +105,7 @@ $ vim ../config/kylin-job/kylin.properties
$ vim ../config/kylin-query/kylin.properties
```
-- Create configMap
+- Create the configMap
```
$ kubectl create configmap -n kylin-example hadoop-config \
$ kubectl exec -it kylin-job-0 -n kylin-example -- bash
$ kubectl get pod kylin-job-0 -n kylin-example -o yaml
```
-- If you don't have a Elasticsearch cluster or not interested in log
collection, please remove filebeat container in both kylin-query-stateful.yaml
and kylin-job-stateful.yaml.
+- If you don't have an Elasticsearch cluster or are not interested in log
collection, please remove the filebeat container from both
kylin-query-stateful.yaml and kylin-job-stateful.yaml.
-- If you want to check detail or want to have a discussion, please read or
comment on [KYLIN-4447 Kylin on kubernetes in production
env](https://issues.apache.org/jira/browse/KYLIN-4447) .
+- If you want to check the details or have a discussion, please read or
comment on [KYLIN-4447 Kylin on kubernetes in production
env](https://issues.apache.org/jira/browse/KYLIN-4447).
-- Find provided docker image at: DockerHub: :
[apachekylin/kylin-client](https://hub.docker.com/r/apachekylin/kylin-client)
\ No newline at end of file
+- Find the provided docker image on DockerHub:
[apachekylin/kylin-client](https://hub.docker.com/r/apachekylin/kylin-client)
\ No newline at end of file
diff --git a/website/_docs/install/kylin_cluster.cn.md
b/website/_docs/install/kylin_cluster.cn.md
index e33a411..7e8eb8e 100644
--- a/website/_docs/install/kylin_cluster.cn.md
+++ b/website/_docs/install/kylin_cluster.cn.md
@@ -37,9 +37,9 @@
kylin.job.lock=org.apache.kylin.storage.hbase.util.ZookeeperJobLock
```
然后将所有任务和查询节点的地址注册到 `kylin.server.cluster-servers`。
-### 配置`CuratorScheculer`进行任务调度
+### 配置`CuratorScheduler`进行任务调度
-从 v3.0.0-alpha 开始,kylin引入基于Curator的主从模式多任务引擎调度器,用户可以修改如下配置来启用CuratorScheculer:
+从 v3.0.0-alpha 开始,kylin引入基于Curator的主从模式多任务引擎调度器,用户可以修改如下配置来启用CuratorScheduler:
```properties
kylin.job.scheduler.default=100
diff --git a/website/_docs/install/kylin_cluster.md
b/website/_docs/install/kylin_cluster.md
index 7543e5c..3da1b94 100644
--- a/website/_docs/install/kylin_cluster.md
+++ b/website/_docs/install/kylin_cluster.md
@@ -37,7 +37,7 @@
kylin.job.lock=org.apache.kylin.storage.hbase.util.ZookeeperJobLock
Then please add all job servers and query servers to the
`kylin.server.cluster-servers`.
-### Use `CuratorScheculer`
+### Use `CuratorScheduler`
Since v3.0.0-alpha, kylin has introduced a Leader/Follower mode multiple-job-engine
scheduler based on Curator. Users can modify the following
configuration to enable CuratorScheduler:
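As a sketch, enabling the scheduler amounts to a couple of lines in
`conf/kylin.properties`. The `kylin.job.scheduler.default=100` value comes
from the diff above; `kylin.server.self-discovery-enabled` is an assumption
from Kylin's clustering docs (verify against your version), and the target
path here is a placeholder:

```shell
# Hedged sketch: append CuratorScheduler settings to a (placeholder) copy
# of kylin.properties. kylin.server.self-discovery-enabled is an assumed
# companion key; check it against your Kylin version's docs.
CONF=/tmp/kylin.properties
cat >> "$CONF" <<'EOF'
kylin.job.scheduler.default=100
kylin.server.self-discovery-enabled=true
EOF
cat "$CONF"
```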
@@ -50,7 +50,7 @@ For more details about the kylin job scheduler, please refer
to [Apache Kylin Wi
### Installing a load balancer
-To send query requests to a cluster instead of a single node, you can deploy a
load balancer such as [Nginx](http://nginx.org/en/), [F5](https://www.f5.com/)
or [cloudlb](https://rubygems.org/gems/cloudlb/), etc., so that the client and
load balancer communication instead communicate with a specific Kylin instance.
+To send query requests to a cluster instead of a single node, you can deploy a
load balancer such as [Nginx](http://nginx.org/en/), [F5](https://www.f5.com/)
or [cloudlb](https://rubygems.org/gems/cloudlb/), etc., so that the client
communicates with the load balancer instead of a specific Kylin instance.
@@ -58,7 +58,7 @@ To send query requests to a cluster instead of a single node,
you can deploy a l
For better stability and optimal performance, it is recommended to perform a
read-write separation deployment, deploying Kylin on two clusters as follows:
-* A Hadoop cluster used to *Cube build*, which can be a large cluster shared
with other applications;
-* An HBase cluster used to *SQL query*. Usually this cluster is configured for
Kylin. The number of nodes does not need to be as many as Hadoop clusters.
HBase configuration can be optimized for Kylin Cube read-only features.
+* A Hadoop cluster used for *Cube build*, which can be a large cluster shared
with other applications;
+* An HBase cluster used for *SQL query*. Usually this cluster is configured
for Kylin. The number of nodes does not need to be as many as in the Hadoop
cluster. The HBase configuration can be optimized for Kylin Cube read-only
features.
This deployment strategy is the best deployment solution for the production
environment. For how to perform read-write separation deployment, please refer
to [Deploy Apache Kylin with Standalone HBase
Cluster](/blog/2016/06/10/standalone-hbase-cluster/) .
\ No newline at end of file