> On Sep 18, 2020, at 9:41 AM, John Naylor <john.nay...@2ndquadrant.com> wrote:
> 
> Attached is version 4, which excludes the output file from pgindent,
> to match recent commit 74d4608f5. Since it won't be indented again, I
> also tweaked the generator script to match pgindent for the typedef,
> since we don't want to lose what pgindent has fixed already. This last
> part isn't new to v4, but I thought I'd highlight it anyway.
> 
> --
> John Naylor                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> <v4-0001-Tweak-the-set-of-candidate-multipliers-for-genera.patch><v4-0002-Use-perfect-hashing-for-NFC-Unicode-normalization.patch><v4-0003-Use-perfect-hashing-for-NKFC-Unicode-normalizatio.patch>

0002 and 0003 look good to me.  I like the way you cleaned up a bit with the 
unicode_norm_props struct, which makes the code a bit more tidy, and on my 
compiler under -O2 it does not generate any extra runtime dereferences, as the 
compiler can see through the struct just fine.  My only concern would be if 
some other compilers might not see through the struct, resulting in a runtime 
performance cost?  I wouldn't even ask, except that qc_hash_lookup is called in 
a fairly tight loop.

To clarify, the following changes to the generated code which remove the struct 
and corresponding dereferences (not intended as a patch submission) cause zero 
bytes of change in the compiled output for me on mac/clang, which is good, and 
generate inconsequential changes on linux/gcc, which is also good, but I wonder 
if that is true for all compilers.  In your commit message for 0001 you 
mentioned testing on a multiplicity of compilers.  Did you do that for 0002 and 
0003 as well?

diff --git a/src/common/unicode_norm.c b/src/common/unicode_norm.c
index 1714837e64..976b96e332 100644
--- a/src/common/unicode_norm.c
+++ b/src/common/unicode_norm.c
@@ -476,8 +476,11 @@ qc_compare(const void *p1, const void *p2)
        return (v1 - v2);
 }
 
-static const pg_unicode_normprops *
-qc_hash_lookup(pg_wchar ch, const unicode_norm_info * norminfo)
+static inline const pg_unicode_normprops *
+qc_hash_lookup(pg_wchar ch,
+                          const pg_unicode_normprops *normprops,
+                          qc_hash_func hash,
+                          int num_normprops)
 {
        int                     h;
        uint32          hashkey;
@@ -487,21 +490,21 @@ qc_hash_lookup(pg_wchar ch, const unicode_norm_info * 
norminfo)
         * in network order.
         */
        hashkey = htonl(ch);
-       h = norminfo->hash(&hashkey);
+       h = hash(&hashkey);
 
        /* An out-of-range result implies no match */
-       if (h < 0 || h >= norminfo->num_normprops)
+       if (h < 0 || h >= num_normprops)
                return NULL;
 
        /*
         * Since it's a perfect hash, we need only match to the specific 
codepoint
         * it identifies.
         */
-       if (ch != norminfo->normprops[h].codepoint)
+       if (ch != normprops[h].codepoint)
                return NULL;
 
        /* Success! */
-       return &norminfo->normprops[h];
+       return &normprops[h];
 }
 
 /*
@@ -518,7 +521,10 @@ qc_is_allowed(UnicodeNormalizationForm form, pg_wchar ch)
        switch (form)
        {
                case UNICODE_NFC:
-                       found = qc_hash_lookup(ch, &UnicodeNormInfo_NFC_QC);
+                       found = qc_hash_lookup(ch,
+                                                                  
UnicodeNormProps_NFC_QC,
+                                                                  
NFC_QC_hash_func,
+                                                                  
NFC_QC_num_normprops);
                        break;
                case UNICODE_NFKC:
                        found = bsearch(&key,
diff --git a/src/include/common/unicode_normprops_table.h 
b/src/include/common/unicode_normprops_table.h
index 5e1e382af5..38300cfa12 100644
--- a/src/include/common/unicode_normprops_table.h
+++ b/src/include/common/unicode_normprops_table.h
@@ -13,13 +13,6 @@ typedef struct
        signed int      quickcheck:4;   /* really UnicodeNormalizationQC */
 } pg_unicode_normprops;
 
-typedef struct
-{
-       const pg_unicode_normprops *normprops;
-       qc_hash_func hash;
-       int                     num_normprops;
-} unicode_norm_info;
-
 static const pg_unicode_normprops UnicodeNormProps_NFC_QC[] = {
        {0x0300, UNICODE_NORM_QC_MAYBE},
        {0x0301, UNICODE_NORM_QC_MAYBE},
@@ -1583,12 +1576,6 @@ NFC_QC_hash_func(const void *key)
        return h[a % 2463] + h[b % 2463];
 }
 
-static const unicode_norm_info UnicodeNormInfo_NFC_QC = {
-       UnicodeNormProps_NFC_QC,
-       NFC_QC_hash_func,
-       1231
-};
-
 static const pg_unicode_normprops UnicodeNormProps_NFKC_QC[] = {
        {0x00A0, UNICODE_NORM_QC_NO},
        {0x00A8, UNICODE_NORM_QC_NO},


--
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Reply via email to