Author: lidong
Date: Fri Jan 22 14:12:21 2021
New Revision: 1885801
URL: http://svn.apache.org/viewvc?rev=1885801&view=rev
Log:
Fix typos (#1553)
Modified:
kylin/site/cn/docs/install/kylin_cluster.html
kylin/site/docs/gettingstarted/faq.html
kylin/site/docs/install/kylin_cluster.html
kylin/site/docs/install/kylin_on_kubernetes.html
kylin/site/feed.xml
Modified: kylin/site/cn/docs/install/kylin_cluster.html
URL:
http://svn.apache.org/viewvc/kylin/site/cn/docs/install/kylin_cluster.html?rev=1885801&r1=1885800&r2=1885801&view=diff
==============================================================================
--- kylin/site/cn/docs/install/kylin_cluster.html (original)
+++ kylin/site/cn/docs/install/kylin_cluster.html Fri Jan 22 14:12:21 2021
@@ -198,9 +198,9 @@ var _hmt = _hmt || [];
</div>
<p>Then register the addresses of all job and query nodes in <code
class="highlighter-rouge">kylin.server.cluster-servers</code>.</p>
-<h3 id="curatorscheculer">Configure <code
class="highlighter-rouge">CuratorScheculer</code> for job scheduling</h3>
+<h3 id="curatorscheduler">Configure <code
class="highlighter-rouge">CuratorScheduler</code> for job scheduling</h3>
-<p>Since v3.0.0-alpha, kylin introduces a Curator-based leader/follower
multiple job engine scheduler; users can modify the following configuration to
enable CuratorScheculer:</p>
+<p>Since v3.0.0-alpha, kylin introduces a Curator-based leader/follower
multiple job engine scheduler; users can modify the following configuration to
enable CuratorScheduler:</p>
<div class="highlighter-rouge"><pre class="highlight"><code><span
class="py">kylin.job.scheduler.default</span><span class="p">=</span><span
class="s">100</span>
<span class="err">kylin.server.self-discovery-</span><span
class="py">enabled</span><span class="p">=</span><span class="s">true</span>
Modified: kylin/site/docs/gettingstarted/faq.html
URL:
http://svn.apache.org/viewvc/kylin/site/docs/gettingstarted/faq.html?rev=1885801&r1=1885800&r2=1885801&view=diff
==============================================================================
--- kylin/site/docs/gettingstarted/faq.html (original)
+++ kylin/site/docs/gettingstarted/faq.html Fri Jan 22 14:12:21 2021
@@ -8842,19 +8842,19 @@ There is an article about <a href="http:
<h4 id="is-kylin-a-generic-sql-engine-for-big-data">Is Kylin a generic SQL
engine for big data?</h4>
<ul>
- <li>No, Kylin is an OLAP engine with SQL interface. The SQL queries need be
matched with the pre-defined OLAP model.</li>
+ <li>No, Kylin is an OLAP engine with SQL interface. The SQL queries should
be matched with the pre-defined OLAP model.</li>
</ul>
<h4 id="whats-a-typical-scenario-to-use-apache-kylin">What’s a typical
scenario to use Apache Kylin?</h4>
<ul>
- <li>Kylin can be the best option if you have a huge table (e.g., >100
million rows), join with lookup tables, while queries need be finished in the
second level (dashboards, interactive reports, business intelligence, etc), and
the concurrent users can be dozens or hundreds.</li>
+ <li>Kylin can be the best option if you have a huge table (e.g., >100
million rows), join with lookup tables, while queries need to be finished in
the second level (dashboards, interactive reports, business intelligence, etc),
and the concurrent users can be dozens or hundreds.</li>
</ul>
<h4
id="how-large-a-data-scale-can-kylin-support-how-about-the-performance">How
large a data scale can Kylin support? How about the performance?</h4>
<ul>
- <li>Kylin can supports second level query performance at TB to PB level
dataset. This has been verified by users like eBay, Meituan, Toutiao. Take
Meituan’s case as an example (till 2018-08), 973 cubes, 3.8 million queries
per day, raw data 8.9 trillion, total cube size 971 TB (original data is
bigger), 50% queries finished in < 0.5 seconds, 90% queries < 1.2
seconds.</li>
+ <li>Kylin can support second level query performance at TB to PB level
dataset. This has been verified by users like eBay, Meituan, Toutiao. Take
Meituan’s case as an example (till 2018-08), 973 cubes, 3.8 million queries
per day, raw data 8.9 trillion, total cube size 971 TB (original data is
bigger), 50% of the queries finished in < 0.5 seconds, 90% queries < 1.2
seconds.</li>
</ul>
<h4 id="who-are-using-apache-kylin">Who are using Apache Kylin?</h4>
@@ -8872,14 +8872,14 @@ There is an article about <a href="http:
<h4
id="how-to-compare-kylin-with-other-sql-engines-like-hive-presto-spark-sql-impala">How
to compare Kylin with other SQL engines like Hive, Presto, Spark SQL,
Impala?</h4>
<ul>
- <li>They answer a query in different ways. Kylin is not a replacement for
them, but a supplement (query accelerator). Many users run Kylin together with
other SQL engines. For the high frequent query patterns, building Cubes can
greatly improve the performance and also offload cluster workloads. For less
queried patterns or ad-hoc queries, ther MPP engines are more flexible.</li>
+ <li>They answer a query in different ways. Kylin is not a replacement for
them, but a supplement (query accelerator). Many users run Kylin together with
other SQL engines. For the high frequent query patterns, building Cubes can
greatly improve the performance and also offload cluster workloads. For less
queried patterns or ad-hoc queries, other MPP engines are more flexible.</li>
</ul>
<h4 id="how-to-compare-kylin-with-druid">How to compare Kylin with Druid?</h4>
<ul>
<li>
- <p>Druid is more suitable for real-time analysis. Kylin is more focus on
OLAP case. Druid has good integration with Kafka as real-time streaming; Kylin
fetches data from Hive or Kafka in batches. The real-time capability of Kylin
is still under development.</p>
+ <p>Druid is more suitable for real-time analysis. Kylin is more focus on
OLAP case. Druid has a good integration with Kafka as real-time streaming;
Kylin fetches data from Hive or Kafka in batches. The real-time capability of
Kylin is still under development.</p>
</li>
<li>
<p>Many internet service providers host both Druid and Kylin, serving
different purposes (real-time and historical).</p>
@@ -8913,18 +8913,18 @@ There is an article about <a href="http:
<ul>
<li>
- <p>The max physical dimension number (exclude derived column in lookup
tables) in a cube is 63; If you can normalize some dimensions to lookup tables,
with derived dimensions, you can create a cube with more than 100
dimensions.</p>
+ <p>The max physical dimension number (exclude derived columns in lookup
tables) in a cube is 63; If you can normalize some dimensions to lookup tables,
with derived dimensions, you can create a cube with more than 100
dimensions.</p>
</li>
<li>
<p>But a cube with > 30 physical dimensions is not recommended; You
even couldn’t save that in Kylin if you don’t optimize the aggregation
groups. Please search “curse of dimensionality”.</p>
</li>
</ul>
-<h4 id="why-i-got-an-error-when-running-a-select---query">Why I got an error
when running a “select * ” query?</h4>
+<h4 id="why-do-i-got-an-error-when-running-a-select---query">Why do I got an
error when running a “select * ” query?</h4>
<ul>
<li>
- <p>The cube only has aggregated data, so all your queries should be
aggregated queries (“GROUP BY”). You can use a SQL with all dimensions be
grouped to get them as close as the detailed result, but that is not the raw
data.</p>
+ <p>The cube has only the aggregated data, so all your queries should be
aggregated queries (“GROUP BY”). You can use a SQL with all dimensions be
grouped to get them as close as the detailed result, but that is not the raw
data.</p>
</li>
<li>
<p>In order to be connected from some BI tools, Kylin tries to answer
“select *” query but please aware the result might not be expected. Please
make sure each query to Kylin is aggregated.</p>
@@ -8943,10 +8943,10 @@ There is an article about <a href="http:
<ul>
<li>
- <p>UHC means Ultra High Cardinality. Cardinality means the number of
distinct values of a dimension. Usually, a dimension’s cardinality is from
tens to millions. If above million, we call it a UHC dimension, for example,
user id, cell number, etc.</p>
+ <p>UHC means Ultra High Cardinality. Cardinality means the number of
distinct values of a dimension. Usually, a dimension’s cardinality is from
tens to millions. If above a million, we call it a UHC dimension, for example,
user id, cell number, etc.</p>
</li>
<li>
- <p>Kylin supports UHC dimension but you need to pay attention to UHC
dimension, especially the encoding and the cuboid combinations. It may cause
your Cube very large and query to be slow.</p>
+ <p>Kylin supports UHC dimension, but you need to pay attention to UHC
dimensions, especially the encodings and the cuboid combinations. It may cause
your Cube to be very large and query to be slow.</p>
</li>
</ul>
@@ -8984,13 +8984,13 @@ There is an article about <a href="http:
<h4 id="how-to-encrypt-cube-data">How to encrypt cube data?</h4>
<ul>
- <li>You can enable encryption at HBase side. Refer
https://hbase.apache.org/book.html#hbase.encryption.server for more
details.</li>
+ <li>You can enable encryption at HBase side. Refer to
https://hbase.apache.org/book.html#hbase.encryption.server for more
details.</li>
</ul>
<h4
id="how-to-schedule-the-cube-build-at-a-fixed-frequency-in-an-automatic-way">How
to schedule the cube build at a fixed frequency, in an automatic way?</h4>
<ul>
- <li>Kylin doesn’t have a built-in scheduler for this. You can trigger that
through Rest API from external scheduler services, like Linux cron job, Apache
Airflow, etc.</li>
+ <li>Kylin doesn’t have a built-in scheduler for this. You can trigger that
through the Rest API from external scheduler services, like Linux cron job,
Apache Airflow, etc.</li>
</ul>
<h4
id="how-to-exportimport-cubeproject-across-different-kylin-environments">How to
export/import cube/project across different Kylin environments?</h4>
@@ -9014,7 +9014,7 @@ There is an article about <a href="http:
<h4
id="the-cube-is-ready-but-why-the-table-does-not-appear-in-the-insight-tab">The
Cube is ready, but why the table does not appear in the “Insight” tab?</h4>
<ul>
- <li>Make sure the “kylin.server.cluster-servers” property in <code
class="highlighter-rouge">conf/kylin.properties</code> is configured with EVERY
Kylin node, all job and query nodes. Kylin nodes notify each other to flush
cache with this configuration. And please ensure the network among them are
healthy.</li>
+ <li>Make sure the “kylin.server.cluster-servers” property in <code
class="highlighter-rouge">conf/kylin.properties</code> is configured with EVERY
Kylin node, all job and query nodes. Kylin nodes notify each other to flush
cache with this configuration. Also, please ensure that the network among them
are healthy.</li>
</ul>
<h4
id="what-should-i-do-if-i-encounter-a-javalangnoclassdeffounderror-error">What
should I do if I encounter a “java.lang.NoClassDefFoundError” error?</h4>
@@ -9032,7 +9032,7 @@ There is an article about <a href="http:
<h4 id="how-to-add-dimensionmeasure-to-a-cube">How to add dimension/measure to
a cube?</h4>
<ul>
- <li>Once a cube is built, its structure couldn’t be modified. To add
dimension/measure, you need to clone a new cube, and then add in it.</li>
+ <li>Once a cube is built, its structure cannot be modified. To add a
dimension/measure, you need to clone a new cube, and then add to it.</li>
</ul>
<p>When the new cube is built, please disable or drop the old one.</p>
@@ -9094,29 +9094,29 @@ kylin.engine.spark-conf.spark.yarn.queue
<h4 id="how-to-add-a-new-jdbc-data-source-dialect">How to add a new JDBC data
source dialect?</h4>
<ul>
- <li>That is easy to add a new type of JDBC data source. You can follow such
steps:</li>
+ <li>It is easy to add a new type of JDBC data source. You can follow such
steps:</li>
</ul>
<p>1) Add the dialect in
source-hive/src/main/java/org/apache/kylin/source/jdbc/JdbcDialect.java</p>
<p>2) Implement a new IJdbcMetadata if {database that you want to add}’s
metadata fetching is different with others and then register it in
JdbcMetadataFactory</p>
-<p>3) You may need to customize the SQL for creating/dropping table in
JdbcExplorer for {database that you want to add}.</p>
+<p>3) You may need to customize the SQL for creating/dropping tables in
JdbcExplorer for {database that you want to add}.</p>
<h4 id="how-to-ask-a-question">How to ask a question?</h4>
<ul>
- <li>Check Kylin documents first. and do a Google search also can help.
Sometimes the question has been answered so you don’t need ask again. If no
matching, please send your question to Apache Kylin user mailing list:
[email protected]; You need to drop an email to
[email protected] to subscribe if you haven’t done so. In the
email content, please provide your Kylin and Hadoop version, specific error
logs (as much as possible), and also the how to re-produce steps.</li>
+ <li>Check Kylin documents first, and doing a Google search can also help.
Sometimes the question has been answered, so you don’t need ask again. If no
matching, please send your question to Apache Kylin user mailing list:
[email protected]; You need to drop an email to
[email protected] to subscribe if you haven’t done so. In the
email content, please provide your Kylin and Hadoop version, specific error
logs (as much as possible), and also the how to re-produce steps.</li>
</ul>
<h4
id="binfind-hive-dependencysh-can-locate-hivehcat-jars-in-local-but-kylin-reports-error-like-javalangnoclassdeffounderror-orgapachehivehcatalogmapreducehcatinputformat-or-javalangnoclassdeffounderror-orgapachehadoophiveqlsessionsessionstate">“bin/find-hive-dependency.sh”
can locate hive/hcat jars in local, but Kylin reports error like
“java.lang.NoClassDefFoundError:
org/apache/hive/hcatalog/mapreduce/HCatInputFormat” or
“java.lang.NoClassDefFoundError:
org/apache/hadoop/hive/ql/session/SessionState”</h4>
<ul>
<li>
- <p>Kylin need many dependent jars (hadoop/hive/hcat/hbase/kafka) on
classpath to work, but Kylin doesn’t ship them. It will seek these jars from
your local machine by running commands like <code
class="highlighter-rouge">hbase classpath</code>, <code
class="highlighter-rouge">hive -e set</code> etc. The founded jars’ path will
be appended to the environment variable <em>HBASE_CLASSPATH</em> (Kylin uses
<code class="highlighter-rouge">hbase</code> shell command to start up, which
will read this). But in some Hadoop distribution (like AWS EMR 5.0), the <code
class="highlighter-rouge">hbase</code> shell doesn’t keep the origin <code
class="highlighter-rouge">HBASE_CLASSPATH</code> value, that causes the
“NoClassDefFoundError”.</p>
+ <p>Kylin needs many dependent jars (hadoop/hive/hcat/hbase/kafka) on
classpath to work, but Kylin doesn’t ship them. It will seek these jars from
your local machine by running commands like <code
class="highlighter-rouge">hbase classpath</code>, <code
class="highlighter-rouge">hive -e set</code> etc. The founded jars’ path will
be appended to the environment variable <em>HBASE_CLASSPATH</em> (Kylin uses
<code class="highlighter-rouge">hbase</code> shell command to start up, which
will read this). But in some Hadoop distribution (like AWS EMR 5.0), the <code
class="highlighter-rouge">hbase</code> shell doesn’t keep the origin <code
class="highlighter-rouge">HBASE_CLASSPATH</code> value, that causes the
“NoClassDefFoundError”.</p>
</li>
<li>
- <p>To fix this, find the hbase shell script (in hbase/bin folder), and
search <em>HBASE_CLASSPATH</em>, check whether it overwrite the value like :</p>
+ <p>To fix this, find the hbase shell script (in hbase/bin folder), and
search <em>HBASE_CLASSPATH</em>, check whether it overwrites the value like
:</p>
</li>
</ul>
@@ -9131,7 +9131,7 @@ kylin.engine.spark-conf.spark.yarn.queue
<h4
id="get-javalangillegalargumentexception-too-high-cardinality-is-not-suitable-for-dictionary----cardinality-5220674-in-build-dimension-dictionary-step">Get
“java.lang.IllegalArgumentException: Too high cardinality is not suitable
for dictionary – cardinality: 5220674” in “Build Dimension Dictionary”
step</h4>
<ul>
- <li>Kylin uses “Dictionary” encoding to encode/decode the dimension
values (check <a href="/blog/2015/08/13/kylin-dictionary/">this blog</a>);
Usually a dimension’s cardinality is less than millions, so the “Dict”
encoding is good to use. As dictionary need be persisted and loaded into
memory, if a dimension’s cardinality is very high, the memory footprint will
be tremendous, so Kylin add a check on this. If you see this error, suggest to
identify the UHC dimension first and then re-evaluate the design (whether need
to make that as dimension?). If must keep it, you can by-pass this error with
couple ways: 1) change to use other encoding (like <code
class="highlighter-rouge">fixed_length</code>, <code
class="highlighter-rouge">integer</code>) 2) or set a bigger value for <code
class="highlighter-rouge">kylin.dictionary.max.cardinality</code> in <code
class="highlighter-rouge">conf/kylin.properties</code>.</li>
+ <li>Kylin uses “Dictionary” encoding to encode/decode the dimension
values (check <a href="/blog/2015/08/13/kylin-dictionary/">this blog</a>);
Usually a dimension’s cardinality is less than millions, so the “Dict”
encoding is good to use. As dictionary need to be persisted and loaded into
memory, if a dimension’s cardinality is very high, the memory footprint will
be tremendous, so Kylin add a check on this. If you see this error, please
identify the UHC dimension first and then re-evaluate the design (whether
it’s needed to make that as a dimension?). If it must be kept, you can
by-pass this error with a couple ways: 1) use other encodings (like <code
class="highlighter-rouge">fixed_length</code>, <code
class="highlighter-rouge">integer</code>) 2) or set a bigger value for <code
class="highlighter-rouge">kylin.dictionary.max.cardinality</code> in <code
class="highlighter-rouge">conf/kylin.properties</code>.</li>
</ul>
<h4 id="how-to-install-kylin-on-cdh-52-or-hadoop-25x">How to Install Kylin on
CDH 5.2 or Hadoop 2.5.x</h4>
@@ -9140,7 +9140,7 @@ kylin.engine.spark-conf.spark.yarn.queue
<li>Check out discussion: <a
href="https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ">https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/kylin-olap/X0GZfsX1jLc/nzs6xAhNpLkJ</a></li>
</ul>
-<div class="highlight"><pre><code class="language-groff" data-lang="groff">I
was able to deploy Kylin with following option in POM.
+<div class="highlight"><pre><code class="language-groff" data-lang="groff">I
was able to deploy Kylin with the following options in POM.
<hadoop2.version>2.5.0</hadoop2.version>
<yarn.version>2.5.0</yarn.version>
<hbase-hadoop2.version>0.98.6-hadoop2</hbase-hadoop2.version>
@@ -9150,17 +9150,17 @@ kylin.engine.spark-conf.spark.yarn.queue
<h4
id="sumfield-returns-a-negative-result-while-all-the-numbers-in-this-field-are--0">SUM(field)
returns a negative result while all the numbers in this field are > 0</h4>
<ul>
- <li>If a column is declared as integer in Hive, the SQL engine (calcite)
will use column’s type (integer) as the data type for “SUM(field)”, while
the aggregated value on this field may exceed the scope of integer; in that
case the cast will cause a negtive value be returned; The workaround is, alter
that column’s type to BIGINT in hive, and then sync the table schema to Kylin
(the cube doesn’t need rebuild); Keep in mind that, always declare as BIGINT
in hive for an integer column which would be used as a measure in Kylin; See
hive number types: <a
href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes">https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes</a></li>
+ <li>If a column is declared as integer in Hive, the SQL engine (calcite)
will use column’s type (integer) as the data type for “SUM(field)”. While
the aggregated value on this field may exceed the scope of integer, the cast
will cause a negative value be returned. The workaround is, alter that
column’s type to BIGINT in hive, and then sync the table schema to Kylin (the
cube doesn’t need rebuild); Keep in mind that, always declare as BIGINT in
hive for an integer column which would be used as a measure in Kylin; See hive
number types: <a
href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes">https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes</a></li>
</ul>
-<h4
id="why-kylin-need-extract-the-distinct-columns-from-fact-table-before-building-cube">Why
Kylin need extract the distinct columns from Fact Table before building
cube?</h4>
+<h4
id="why-does-kylin-need-to-extract-the-distinct-columns-from-fact-table-before-building-cube">Why
does Kylin need to extract the distinct columns from Fact Table before
building cube?</h4>
<ul>
- <li>Kylin uses dictionary to encode the values in each column, this greatly
reduce the cube’s storage size. To build the dictionary, Kylin need fetch the
distinct values for each column.</li>
+ <li>Kylin uses dictionary to encode the values in each column, this
significantly reduces the cube’s storage size. To build the dictionary, Kylin
needs to fetch the distinct values for each column.</li>
</ul>
-<h4 id="why-kylin-calculate-the-hive-table-cardinality">Why Kylin calculate
the HIVE table cardinality?</h4>
+<h4 id="why-does-kylin-calculate-the-hive-table-cardinality">Why does Kylin
calculate the HIVE table cardinality?</h4>
<ul>
- <li>The cardinality of dimensions is an important measure of cube
complexity. The higher the cardinality, the bigger the cube, and thus the
longer to build and the slower to query. Cardinality > 1,000 is worth
attention and > 1,000,000 should be avoided at best effort. For optimal cube
performance, try reduce high cardinality by categorize values or derive
features.</li>
+ <li>The cardinality of dimensions is an important measure of the cube
complexity. The higher the cardinality, the bigger the cube, and thus the
longer to build, and the slower to query. Cardinality > 1,000 is worth
attention and > 1,000,000 should be avoided at best effort. For optimal cube
performance, try to reduce high cardinality by categorize values or derive
features.</li>
</ul>
<h4 id="how-to-add-new-user-or-change-the-default-password">How to add new
user or change the default password?</h4>
@@ -9208,7 +9208,7 @@ group by a.slr_sgmt</code></pre></div>
<h4
id="failed-to-run-buildcubewithenginetest-saying-failed-to-connect-to-hbase-while-hbase-is-active">Failed
to run BuildCubeWithEngineTest, saying failed to connect to hbase while hbase
is active</h4>
<ul>
- <li>User may get this error when first time run hbase client, please check
the error trace to see whether there is an error saying couldn’t access a
folder like “/hadoop/hbase/local/jars”; If that folder doesn’t exist,
create it.</li>
+ <li>User may get this error when running hbase client the first time, please
check the error trace to see whether there is an error saying couldn’t access
a folder like “/hadoop/hbase/local/jars”; If that folder doesn’t exist,
create it.</li>
</ul>
<h4
id="kylin-jdbc-driver-returns-a-different-datetime-than-the-rest-api-seems-it-add-the-timezone-to-parse-the-date">Kylin
JDBC driver returns a different Date/time than the REST API, seems it add the
timezone to parse the date.</h4>
@@ -9216,7 +9216,7 @@ group by a.slr_sgmt</code></pre></div>
<li>Please check the <a
href="http://apache-kylin.74782.x6.nabble.com/JDBC-query-result-Date-column-get-wrong-value-td5370.html">post
in mailing list</a></li>
</ul>
-<h4
id="what-kind-of-data-be-left-in-kylinenvhdfs-working-dir--we-often-execute-kylin-cleanup-storage-command-but-now-our-working-dir-folder-is-about-300-gb-size-can-we-delete-old-data-manually">What
kind of data be left in “kylin.env.hdfs-working-dir” ? We often execute
kylin cleanup storage command, but now our working dir folder is about 300 GB
size, can we delete old data manually?</h4>
+<h4
id="what-kind-of-data-is-left-in-kylinenvhdfs-working-dir--we-often-execute-kylin-cleanup-storage-command-but-now-our-working-dir-folder-is-about-300-gb-size-can-we-delete-old-data-manually">What
kind of data is left in “kylin.env.hdfs-working-dir” ? We often execute
kylin cleanup storage command, but now our working dir folder is about 300 GB
size, can we delete old data manually?</h4>
<ul>
<li>
@@ -9228,7 +9228,7 @@ group by a.slr_sgmt</code></pre></div>
</ul>
<h4 id="how-to-escape-the-key-word-in-fuzzy-match-like-queries">How to escape
the key word in fuzzy match (like) queries?</h4>
-<p>“%”, “<em>” are key words in the “like” clause; “%” matches
any character, and “</em>” matches a single character; When you wants to
match the key word like “<em>”, need to escape them with another character
ahead; Below is a sample with “/” to escape, the query is to match the
“xiao</em>”:<br />
+<p>“%”, “<em>” are keywords in the “like” clause; “%” matches
any character, and “</em>” matches a single character; When you want to
match the keyword like “<em>”, you need to escape them with another
character ahead; Below is a sample with “/” to escape, the query is to
match the “xiao</em>”:<br />
“select username from gg_user where username like ‘%xiao/_%’ escape
‘/’; ”</p>
</article>
Modified: kylin/site/docs/install/kylin_cluster.html
URL:
http://svn.apache.org/viewvc/kylin/site/docs/install/kylin_cluster.html?rev=1885801&r1=1885800&r2=1885801&view=diff
==============================================================================
--- kylin/site/docs/install/kylin_cluster.html (original)
+++ kylin/site/docs/install/kylin_cluster.html Fri Jan 22 14:12:21 2021
@@ -8864,7 +8864,7 @@ The <em>job</em> mode means that the ser
<p>Then please add all job servers and query servers to the <code
class="highlighter-rouge">kylin.server.cluster-servers</code>.</p>
-<h3 id="use-curatorscheculer">Use <code
class="highlighter-rouge">CuratorScheculer</code></h3>
+<h3 id="use-curatorscheduler">Use <code
class="highlighter-rouge">CuratorScheduler</code></h3>
<p>Since v3.0.0-alpha, kylin introduces the Leader/Follower mode multiple job
engines scheduler based on Curator. Users can modify the following
configuration to enable CuratorScheduler:</p>
@@ -8877,15 +8877,15 @@ The <em>job</em> mode means that the ser
<h3 id="installing-a-load-balancer">Installing a load balancer</h3>
-<p>To send query requests to a cluster instead of a single node, you can
deploy a load balancer such as <a href="http://nginx.org/en/">Nginx</a>, <a
href="https://www.f5.com/">F5</a> or <a
href="https://rubygems.org/gems/cloudlb/">cloudlb</a>, etc., so that the client
and load balancer communication instead communicate with a specific Kylin
instance.</p>
+<p>To send query requests to a cluster instead of a single node, you can
deploy a load balancer such as <a href="http://nginx.org/en/">Nginx</a>, <a
href="https://www.f5.com/">F5</a> or <a
href="https://rubygems.org/gems/cloudlb/">cloudlb</a>, etc., so that the client
communicate with the load balancer instead of a specific Kylin instance.</p>
<h3 id="read-and-write-separation-deployment">Read and write separation
deployment</h3>
<p>For better stability and optimal performance, it is recommended to perform
a read-write separation deployment, deploying Kylin on two clusters as
follows:</p>
<ul>
- <li>A Hadoop cluster used to <em>Cube build</em>, which can be a large
cluster shared with other applications;</li>
- <li>An HBase cluster used to <em>SQL query</em>. Usually this cluster is
configured for Kylin. The number of nodes does not need to be as many as Hadoop
clusters. HBase configuration can be optimized for Kylin Cube read-only
features.</li>
+ <li>A Hadoop cluster used for <em>Cube build</em>, which can be a large
cluster shared with other applications;</li>
+ <li>An HBase cluster used for <em>SQL query</em>. Usually this cluster is
configured for Kylin. The number of nodes does not need to be as many as Hadoop
clusters. HBase configuration can be optimized for Kylin Cube read-only
features.</li>
</ul>
<p>This deployment strategy is the best deployment solution for the production
environment. For how to perform read-write separation deployment, please refer
to <a href="/blog/2016/06/10/standalone-hbase-cluster/">Deploy Apache Kylin
with Standalone HBase Cluster</a> .</p>
Modified: kylin/site/docs/install/kylin_on_kubernetes.html
URL:
http://svn.apache.org/viewvc/kylin/site/docs/install/kylin_on_kubernetes.html?rev=1885801&r1=1885800&r2=1885801&view=diff
==============================================================================
--- kylin/site/docs/install/kylin_on_kubernetes.html (original)
+++ kylin/site/docs/install/kylin_on_kubernetes.html Fri Jan 22 14:12:21 2021
@@ -8833,9 +8833,9 @@ var _hmt = _hmt || [];
<article
class="post-content" >
- <p>Kubernetes is a
portable, extensible, open-source platform for managing containerized workloads
and services, that facilitates both declarative configuration and automation.
It has a large, rapidly growing ecosystem. Kubernetes services, support, and
tools are widely available.</p>
+ <p>Kubernetes is a
portable, extensible, open-source platform for managing containerized workloads
and services, it facilitates both declarative configuration and automation. It
has a large, rapidly growing ecosystem. Kubernetes services, support and tools
are widely available.</p>
-<p>Apache Kylin is a open source, distributed analytical data warehouse for
big data. Deploy Kylin on Kubernetes cluster, will reduce cost of maintenance
and extension.</p>
+<p>Apache Kylin is an open source, distributed analytical data warehouse for
big data. Deploy Kylin on Kubernetes cluster will reduce cost of maintenance
and extension.</p>
<h2 id="directory">Directory</h2>
<p>Visit and download https://github.com/apache/kylin/tree/master/kubernetes
and you will find three directory:</p>
@@ -8844,14 +8844,14 @@ var _hmt = _hmt || [];
<li><strong>config</strong> <br />
Please update your configuration file here.</li>
<li><strong>template</strong> <br />
- This directory provided two deployment templates, one for quick-start
purpose, another for production/distributed deployment.
+ This directory provides two deployment templates, one for quick-start
purpose, another for production/distributed deployment.
<ul>
- <li>Quick-start template is for one node deployment with an ALL kylin
instance.</li>
- <li>Production template is for multi-nodes deployment with a few of
job/query kylin instances; and some other service like memcached and
filebeat(check doc at <a href="https://www.elastic.co/what-is/elk-stack">ELK
stack</a>) will help to satisfy log collection/query cache/session sharing
demand.</li>
+ <li>The quick-start template is for one node deployment with an ALL
kylin instance.</li>
+ <li>The production template is for multi-nodes deployment with a few
job/query kylin instances. Moreover, some other services like memcached and
filebeat(check doc at <a href="https://www.elastic.co/what-is/elk-stack">ELK
stack</a>) will help to satisfy log collection/query cache/session sharing
demand.</li>
</ul>
</li>
<li><strong>docker</strong> <br />
- Docker image is the pre-requirement of Kylin on Kubernetes, please check this
directory if you need build it yourself. For CDH5.x user, you may consider use
a provided image on DockerHub.</li>
+ Docker image is the pre-requirement of Kylin on Kubernetes, please check this
directory if you need to build it yourself. For CDH5.x user, you may consider
using a provided image on DockerHub.</li>
</ul>
<hr />
@@ -8862,38 +8862,38 @@ var _hmt = _hmt || [];
<li>A hadoop cluster.</li>
<li>A K8s cluster, with sufficient system resources.</li>
<li><strong>kylin-client</strong> image.</li>
- <li>A Elasticsearch cluster(maybe optional).</li>
+ <li>An Elasticsearch cluster(maybe optional).</li>
</ol>
<h2 id="how-to-build-docker-image">How to build docker image</h2>
<h3 id="hadoop-client-image">Hadoop-client image</h3>
-<p>What is hadoop-client docker image and why we need this?</p>
+<p>What is a hadoop-client docker image and why do we need this?</p>
-<p>As we all know, the node you want to deploy Kylin, should contains Hadoop
dependency(jars and configuration files), these dependency let you have access
to Hadoop Service, such as HDFS, HBase, Hive, which are needed by Apache Kylin.
Unfortunately, each Hadoop distribution(CHD or HDP etc.) has its own specific
jars. So, we can build specific image for specific Hadoop distribution, which
will make image management task more easier. This will have following two
benefits:</p>
+<p>As we all know, the node you want to deploy Kylin should contain Hadoop
dependencies(jars and configuration files), these dependencies let you have
access to Hadoop Services, such as HDFS, HBase, Hive, which are needed by
Apache Kylin. Unfortunately, each Hadoop distribution(CHD or HDP etc.) has its
own specific jars. So, we can build specific images for specific Hadoop
distributions, which will make image management task easier. This will have the
following two benefits:</p>
<ul>
- <li>Someone who has better knowledge on Hadoop can do this work, and let
kylin user build their Kylin image base on provided Hadoop-Client image.</li>
+ <li>Someone who has more knowledge on Hadoop can do this work, and let kylin
users build their Kylin image base on provided Hadoop-Client image.</li>
<li>Upgrade Kylin will be much easier.</li>
</ul>
<p>Build Step<br />
-- Prepare and modify Dockerfile(If you are using other hadoop distribution,
please consider build image yourself). <br />
+- Prepare and modify Dockerfile(If you are using other hadoop distribution,
please consider build an image yourself). <br />
- Place Spark binary(such as <code
class="highlighter-rouge">spark-2.3.2-bin-hadoop2.7.tgz</code>) into dir <code
class="highlighter-rouge">provided-binary</code>.<br />
-- Run <code class="highlighter-rouge">build-image.sh</code> to build image.</p>
+- Run <code class="highlighter-rouge">build-image.sh</code> to build the
image.</p>
<h3 id="kylin-client-image">Kylin-client image</h3>
-<p>What is kylin-client docker images?</p>
+<p>What is a kylin-client docker image?</p>
-<p><strong>kylin-client</strong> is a docker image which based on
<strong>hadoop-client</strong>, it will provided the flexibility of upgrade of
Apache Kylin.</p>
+<p><strong>kylin-client</strong> is a docker image which based on
<strong>hadoop-client</strong>, it will provide the flexibility of upgrade of
Apache Kylin.</p>
<p>Build Step</p>
<ul>
<li>Place Kylin binary(such as <code
class="highlighter-rouge">apache-kylin-3.0.1-bin-cdh57.tar.gz</code>) and
uncompress it into current dir.</li>
- <li>Modify <code class="highlighter-rouge">Dockerfile</code> , change the
value of <code class="highlighter-rouge">KYLIN_VERSION</code> and name of base
image(hadoop-client).</li>
+ <li>Modify <code class="highlighter-rouge">Dockerfile</code> , change the
value of <code class="highlighter-rouge">KYLIN_VERSION</code> and the name of
base image(hadoop-client).</li>
<li>Run <code class="highlighter-rouge">build-image.sh</code> to build
image.</li>
</ul>
@@ -8901,11 +8901,11 @@ var _hmt = _hmt || [];
<h2 id="how-to-deploy-kylin-on-kubernetes">How to deploy kylin on
kubernetes</h2>
-<p>Here let’s take a look of how to deploy a kylin cluster which connect to
CDH 5.7.</p>
+<p>Here let’s take a look at how to deploy a kylin cluster which connects to
CDH 5.7.</p>
<p>1 <code
class="highlighter-rouge">kubenetes/template/production/example/deployment</code>
is the working directory.</p>
-<p>2 Update hadoop configuration files (<code
class="highlighter-rouge">kubenetes/template/production/example/config/hadoop</code>)
and filebeat ’s configuration file.</p>
+<p>2 Update hadoop configuration files (<code
class="highlighter-rouge">kubenetes/template/production/example/config/hadoop</code>)
and filebeat’s configuration file.</p>
<p>3 Create statefulset and service for memcached.</p>
@@ -8920,7 +8920,7 @@ statefulset.apps/kylin-memcached created
</div>
<ul>
- <li>Check hostname of cache service.</li>
+ <li>Check the hostname of cache service.</li>
</ul>
<div class="highlighter-rouge"><pre class="highlight"><code>$ kubectl run
-it --image=busybox:1.28.4 --rm --restart=Never sh -n test-dns
@@ -8948,7 +8948,7 @@ $ vim ../config/kylin-query/kylin.proper
</div>
<ul>
- <li>Create configMap</li>
+ <li>Create the configMap</li>
</ul>
<div class="highlighter-rouge"><pre class="highlight"><code>$ kubectl create
configmap -n kylin-example hadoop-config \
@@ -9063,13 +9063,13 @@ $ kubectl get pod kylin-job-0 -n kylin-
</code></p>
</li>
<li>
- <p>If you don’t have a Elasticsearch cluster or not interested in log
collection, please remove filebeat container in both kylin-query-stateful.yaml
and kylin-job-stateful.yaml.</p>
+ <p>If you don’t have an Elasticsearch cluster or not interested in log
collection, please remove filebeat container in both kylin-query-stateful.yaml
and kylin-job-stateful.yaml.</p>
</li>
<li>
- <p>If you want to check detail or want to have a discussion, please read
or comment on <a
href="https://issues.apache.org/jira/browse/KYLIN-4447">KYLIN-4447 Kylin on
kubernetes in production env</a> .</p>
+ <p>If you want to check the details or want to have a discussion, please
read or comment on <a
href="https://issues.apache.org/jira/browse/KYLIN-4447">KYLIN-4447 Kylin on
kubernetes in production env</a> .</p>
</li>
<li>
- <p>Find provided docker image at: DockerHub: : <a
href="https://hub.docker.com/r/apachekylin/kylin-client">apachekylin/kylin-client</a></p>
+ <p>Find the provided docker image at: DockerHub: : <a
href="https://hub.docker.com/r/apachekylin/kylin-client">apachekylin/kylin-client</a></p>
</li>
</ul>
Modified: kylin/site/feed.xml
URL:
http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1885801&r1=1885800&r2=1885801&view=diff
==============================================================================
--- kylin/site/feed.xml (original)
+++ kylin/site/feed.xml Fri Jan 22 14:12:21 2021
@@ -19,8 +19,8 @@
<description>Apache Kylin Home</description>
<link>http://kylin.apache.org/</link>
<atom:link href="http://kylin.apache.org/feed.xml" rel="self"
type="application/rss+xml"/>
- <pubDate>Thu, 21 Jan 2021 05:59:18 -0800</pubDate>
- <lastBuildDate>Thu, 21 Jan 2021 05:59:18 -0800</lastBuildDate>
+ <pubDate>Fri, 22 Jan 2021 05:59:14 -0800</pubDate>
+ <lastBuildDate>Fri, 22 Jan 2021 05:59:14 -0800</lastBuildDate>
<generator>Jekyll v2.5.3</generator>
<item>