This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git


The following commit(s) were added to refs/heads/master by this push:
     new 2f55f8551 IMPALA-11519: [DOCS] add UTF-8 requirements
2f55f8551 is described below

commit 2f55f85519c0bd0baec83e74104a87fe4859726e
Author: Shajini Thayasingh <[email protected]>
AuthorDate: Tue Aug 23 09:01:12 2022 -0700

    IMPALA-11519: [DOCS] add UTF-8 requirements
    
    added a note about Glibc version and en_US.UTF-8 locale
    updated the notes in both topics
    Change-Id: I4d7a21c787c66868219c7bd64aa31f772de2f850
    Reviewed-on: http://gerrit.cloudera.org:8080/18897
    Reviewed-by: Quanlong Huang <[email protected]>
    Tested-by: Impala Public Jenkins <[email protected]>
---
 docs/topics/impala_components.xml | 67 ++++++++++++++++++++-------------------
 docs/topics/impala_utf_8.xml      | 13 ++++++--
 2 files changed, 45 insertions(+), 35 deletions(-)

diff --git a/docs/topics/impala_components.xml b/docs/topics/impala_components.xml
index 8f5f7f383..cb70eb84f 100644
--- a/docs/topics/impala_components.xml
+++ b/docs/topics/impala_components.xml
@@ -47,46 +47,47 @@ under the License.
 
     <conbody>
 
-      <p> The core Impala component is the Impala daemon, physically represented
-        by the <codeph>impalad</codeph> process. A few of the key functions that
-        an Impala daemon performs are:<ul>
+      <p> The core Impala component is the Impala daemon, physically represented by the
+          <codeph>impalad</codeph> process. A few of the key functions that an Impala daemon
+        performs are:<ul>
           <li>Reads and writes to data files.</li>
-          <li>Accepts queries transmitted from the <codeph>impala-shell</codeph>
-            command, Hue, JDBC, or ODBC.</li>
-          <li>Parallelizes the queries and distributes work across the
-            cluster.</li>
-          <li>Transmits intermediate query results back to the central
-            coordinator. </li>
+          <li>Accepts queries transmitted from the <codeph>impala-shell</codeph> command, Hue, JDBC,
+            or ODBC.</li>
+          <li>Parallelizes the queries and distributes work across the cluster.</li>
+          <li>Transmits intermediate query results back to the central coordinator. </li>
         </ul></p>
       <p>Impala daemons can be deployed in one of the following ways:<ul>
-          <li>HDFS and Impala are co-located, and each Impala daemon runs on the
-            same host as a DataNode.</li>
-          <li>Impala is deployed separately in a compute cluster and reads
-            remotely from HDFS, S3, ADLS, etc.</li>
+          <li>HDFS and Impala are co-located, and each Impala daemon runs on the same host as a
+            DataNode.</li>
+          <li>Impala is deployed separately in a compute cluster and reads remotely from HDFS, S3,
+            ADLS, etc.</li>
         </ul></p>
 
-      <p> The Impala daemons are in constant communication with StateStore, to
-        confirm which daemons are healthy and can accept new work. </p>
-
-      <p rev="1.2"> They also receive broadcast messages from the
-          <cmdname>catalogd</cmdname> daemon (introduced in Impala 1.2) whenever
-        any Impala daemon in the cluster creates, alters, or drops any type of
-        object, or when an <codeph>INSERT</codeph> or <codeph>LOAD DATA</codeph>
-        statement is processed through Impala. This background communication
-        minimizes the need for <codeph>REFRESH</codeph> or <codeph>INVALIDATE
-          METADATA</codeph> statements that were needed to coordinate metadata
-        across Impala daemons prior to Impala 1.2. </p>
-
-      <p rev="2.9.0 IMPALA-3807 IMPALA-5147 IMPALA-5503">
-        In <keyword keyref="impala29_full"/> and higher, you can control which hosts act as query coordinators
-        and which act as query executors, to improve scalability for highly concurrent workloads on large clusters.
-        See <xref keyref="scalability_coordinator"/> for details.
-      </p>
+      <p> The Impala daemons are in constant communication with StateStore, to confirm which daemons
+        are healthy and can accept new work. </p>
+
+      <p rev="1.2"> They also receive broadcast messages from the <cmdname>catalogd</cmdname> daemon
+        (introduced in Impala 1.2) whenever any Impala daemon in the cluster creates, alters, or
+        drops any type of object, or when an <codeph>INSERT</codeph> or <codeph>LOAD DATA</codeph>
+        statement is processed through Impala. This background communication minimizes the need for
+          <codeph>REFRESH</codeph> or <codeph>INVALIDATE METADATA</codeph> statements that were
+        needed to coordinate metadata across Impala daemons prior to Impala 1.2. </p>
+
+      <p rev="2.9.0 IMPALA-3807 IMPALA-5147 IMPALA-5503"> In <keyword keyref="impala29_full"/> and
+        higher, you can control which hosts act as query coordinators and which act as query
+        executors, to improve scalability for highly concurrent workloads on large clusters. See
+          <xref keyref="scalability_coordinator"/> for details. </p>
+
+      <note>Impala daemons should be deployed on nodes using the same Glibc version since different
+        Glibc version supports different Unicode standard version and also ensure that the
+        en_US.UTF-8 locale is installed in the nodes. Not using the same Glibc version might result
+        in inconsistent UTF-8 behavior when UTF8_MODE is set to true.</note>
 
       <p>
-        <b>Related information:</b> <xref href="impala_config_options.xml#config_options"/>,
-        <xref href="impala_processes.xml#processes"/>, <xref href="impala_timeouts.xml#impalad_timeout"/>,
-        <xref href="impala_ports.xml#ports"/>, <xref href="impala_proxy.xml#proxy"/>
+        <b>Related information:</b>
+        <xref href="impala_config_options.xml#config_options"/>, <xref
+          href="impala_processes.xml#processes"/>, <xref href="impala_timeouts.xml#impalad_timeout"
+        />, <xref href="impala_ports.xml#ports"/>, <xref href="impala_proxy.xml#proxy"/>
       </p>
     </conbody>
   </concept>
diff --git a/docs/topics/impala_utf_8.xml b/docs/topics/impala_utf_8.xml
index fac6bce88..f6a5b8ed1 100644
--- a/docs/topics/impala_utf_8.xml
+++ b/docs/topics/impala_utf_8.xml
@@ -48,8 +48,17 @@ under the License.
     query option can be set globally, or at per session level. Only queries with UTF8_MODE=true will
     have UTF-8 aware behaviors.</p>
    <p>
-    <note>If the query option UTF8_MODE is turned on globally, existing queries that depend on the
-     original binary behavior need to explicitly set UTF8_MODE=false.</note></p>
+    <note>
+          <ul id="ul_vs2_qrx_p5b">
+            <li>If the query option UTF8_MODE is turned on globally, existing queries that depend on
+              the original binary behavior need to explicitly set UTF8_MODE=false.</li>
+            <li>Impala Daemons should be deployed on nodes using the same Glibc version since
+              different Glibc version supports different Unicode standard version and also ensure
+              that the en_US.UTF-8 locale is installed in the nodes. Not using the same Glibc
+              version might result in inconsistent UTF-8 behavior when UTF8_MODE is set to
+              true.</li>
+          </ul>
+        </note></p>
   </conbody>
  </concept>
  <concept id="list_string_functions">
