Hi Arthur, I went through the patch - just skimming through the diffs, will do more testing tomorrow. Here are a few initial comments.
1) max_shared_dictionaries_size / PGC_POSTMASTER I'm not quite sure why the GUC is defined as PGC_POSTMASTER, i.e. why it can't be changed after server start. That seems like a fairly useful thing to do (e.g. increase the limit while the server is running), and after looking at the code I think it shouldn't be difficult to change. The other thing I'd suggest is handling "-1" as "no limit". 2) max_shared_dictionaries_size / size of number Some of the comments dealing with the GUC treat it as a number of dictionaries (instead of a size). I suppose that's due to how the original patch was implemented. 3) Assert(max_shared_dictionaries_size); I'd say that assert is not very clear - it should be Assert(max_shared_dictionaries_size > 0); or something along the lines. It's also a good idea to add a comment explaining the assert, say /* we can only get here when shared dictionaries are enabled */ Assert(max_shared_dictionaries_size > 0); 4) I took the liberty of rewording some of the docs/comments. See the attached diffs, that should apply on top of 0003 and 0004 patches. Please, treat those as mere suggestions. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 6862d5e..6747fe2 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -1433,23 +1433,23 @@ include_dir 'conf.d' </term> <listitem> <para> - Sets the maximum size of all text search dictionaries loaded into shared - memory. The default is 100 megabytes (<literal>100MB</literal>). This - parameter can only be set at server start. + Specifies the amount of shared memory to be used to store full-text search + search dictionaries. The default is 100 megabytes (<literal>100MB</literal>). + This parameter can only be set at server start. </para> <para> - Currently controls only loading of <application>Ispell</application> - dictionaries (see <xref linkend="textsearch-ispell-dictionary"/>). - After compiling the dictionary it will be copied into shared memory. - Another backends on first use of the dictionary will use it from shared - memory, so it doesn't need to compile the dictionary second time. + Currently only <application>Ispell</application> dictionaries (see + <xref linkend="textsearch-ispell-dictionary"/>) may be loaded into + shared memory. The first backend requesting the dictionary will + build it and copy it into shared memory, so that other backends can + reuse it without having to build it again. </para> <para> - If total size of simultaneously loaded dictionaries reaches the maximum - allowed size then a new dictionary will be loaded into local memory of - a backend. + If the size of simultaneously loaded dictionaries reaches the maximum + allowed size, additional dictionares will be loaded into private backend + memory (effectively disabling the sharing). </para> </listitem> </varlistentry> diff --git a/src/backend/tsearch/ts_shared.c b/src/backend/tsearch/ts_shared.c index bfc5292..22d58a0 100644 --- a/src/backend/tsearch/ts_shared.c +++ b/src/backend/tsearch/ts_shared.c @@ -22,7 +22,7 @@ /* - * Hash table structures + * Hash table entries representing shared dictionaries. */ typedef struct { @@ -37,7 +37,8 @@ typedef struct static dshash_table *dict_table = NULL; /* - * Shared struct for locking + * Information about the main shmem segment, used to coordinate + * access to the hash table and dictionaries. */ typedef struct { @@ -53,8 +54,8 @@ typedef struct static TsearchCtlData *tsearch_ctl; /* - * GUC variable for maximum number of shared dictionaries. Default value is - * 100MB. + * Maximum allowed amount of shared memory for shared dictionaries, + * in kilobytes. Default value is 100MB. */ int max_shared_dictionaries_size = 100 * 1024; @@ -213,7 +202,7 @@ ts_dict_shmem_location(DictInitData *initoptions, /* * Release memory occupied by the dictionary. Function just unpins DSM mapping. - * If nobody else hasn't mapping to this DSM then unping DSM segment. + * If nobody else hasn't mapping to this DSM then unpin DSM segment. * * dictid: Oid of the dictionary. */ @@ -312,6 +301,7 @@ init_dict_table(void) MemoryContext old_context; dsa_area *dsa; + /* bail out if shared dictionaries not allowed */ if (max_shared_dictionaries_size == 0) return; diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 172627a..b10ec48 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -2939,7 +2939,7 @@ static struct config_int ConfigureNamesInt[] = gettext_noop("Currently controls only loading of Ispell dictionaries. " "If total size of simultaneously loaded dictionaries " "reaches the maximum allowed size then a new dictionary " - "will be loaded into local memory of a backend."), + "will be loaded into private backend memory."), GUC_UNIT_KB, }, &max_shared_dictionaries_size,
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml index 82afe20..d1ae7b7 100644 --- a/doc/src/sgml/textsearch.sgml +++ b/doc/src/sgml/textsearch.sgml @@ -3034,15 +3034,17 @@ CREATE TEXT SEARCH DICTIONARY english_stem ( <title>Dictionaries in Shared Memory</title> <para> - Some dictionaries, especially <application>Ispell</application>, consumes a - noticable value of memory. Size of a dictionary can reach tens of megabytes. - Most of them also stores configuration in text files. A dictionary is compiled - during first access per a user session. + Some dictionaries, especially <application>Ispell</application>, consumes + a significant amount of memory, in some cases tens of megabytes. Most of + them store the data in text files, and building the in-memory structure is + both CPU and time-consuming. Instead of doing this in each backend when + it needs a dictionary for the first time, the compiled dictionary may be + stored in shared memory so that it may be reused by other backends. </para> <para> - To store dictionaries in shared memory set to <xref linkend="guc-max-shared-dictionaries-size"/> - parameter value greater than zero before server starting. + To enable storing dictionaries in shared memory, set <xref linkend="guc-max-shared-dictionaries-size"/> + parameter to a value greater than zero. </para> </sect2> diff --git a/src/backend/tsearch/dict_ispell.c b/src/backend/tsearch/dict_ispell.c index f8ab16d..43aa27a 100644 --- a/src/backend/tsearch/dict_ispell.c +++ b/src/backend/tsearch/dict_ispell.c @@ -5,14 +5,14 @@ * * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group * - * By default all Ispell dictionaries are stored in DSM. But if number of - * loaded dictionaries reached maximum allowed value then it will be - * allocated within its memory context (dictCtx). + * By default all Ispell dictionaries are stored in DSM. But if the amount + * of memory exceeds max_shared_dictionaries_size, then the dictionary be + * allocated in private backend memory (in dictCtx context). * * All necessary data are built within dispell_build() function. But * structures for regular expressions are compiled on first demand and * stored using AffixReg array. It is because regex_t and Regis cannot be - * stored in shared memory. + * stored in shared memory easily. * * * IDENTIFICATION