Hello

2010/9/29 Itagaki Takahiro <itagaki.takah...@gmail.com>:
> On Thu, Sep 9, 2010 at 8:57 PM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
>> I am sending an updated version.
>>
>> changes:
>> * tag %v removed from format function,
>> * proprietary tags %lq and %iq removed from sprintf
>> * code cleaned
>>
>> patch divided to two parts - format function and stringfunc (contains
>> sprintf function and substitute function)
>
> === Discussions about the spec ===
> Two patches add format() into the core, and substitute() and sprintf() into
> stringfunc contrib module. But will we have 3 versions of string formatters?
>
> IMHO, substitute() is the best choice that we will have in the core because
> functionalities in format() and sprintf() can be achieved by combination of
> substitute() and quote_nullable(), quote_ident(), or to_char(). I think the
> core will provide only simple and non-overlapped features. Users can write
> wrapper functions by themselves if they think the description is redundant.

I think we need three variants of formatting functions:

* "format" in core, for simply building messages and SQL strings,
* "sprintf" in contrib, for traditionalists - this function doesn't fit
  well into the SQL environment and it's too heavy; moreover, it duplicates
  some functionality of the "to_char" function,
* "substitute" provides just positional, unformatted parameters - that
  isn't a typical use case, so it doesn't need to be in core either.

>
> === format.diff ===
> * It has a reject in doc, but the hunk can be fixed easily.
>   1 out of 2 hunks FAILED -- saving rejects to file
>   doc/src/sgml/func.sgml.rej
>   COMMENT: We have the function list in alphabetical order,
>   so format() should be inserted after encode().

fixed

> * It can be built without compile warnings.
> * Enough documentation and regression tests are included.
>
> === stringfunc.diff ===
> * It can be applied cleanly and built without compile warnings.
> * Documentation is included, but not enough.
>   COMMENT: According to existing docs, function list are described with
>   <variablelist> or <table>.

fixed

> * Enough regression tests are included.
> * COMMENT: stringfunc directory should be added to contrib/Makefile.
>
> * BUG: stringfunc_substitute_nv() calls text_format().
>   I think we don't need stringfunc_substitute_nv at all.
>   It can be replaced by stringfunc_substitute(). _nv version is only
>   required if it is in the core because of sanity regression test.

You are right - but I am not sure about the coding patterns for contrib
modules, so I designed it with respect to the core sanity check.

>
> * BUG?: The doc says sprintf() doesn't support length modifiers,
>   but it is actually supported in broken state:

I was wrong in the documentation - length modifiers are supported,
positional modifiers are not supported. fixed.

> postgres=# SELECT sprintf('%*s', 2, 'ABC');
>  sprintf
> ---------
>  ABC      <= should be ERROR if unsupported, or AB if supported.
> (1 row)

It works correctly - the "width" modifier doesn't truncate a string.
A string is truncated only by the "precision" modifier:

SELECT sprintf('%.*s', 2, 'ABC') --> AB

I checked it with gcc. Please try:

printf(">>%s<<\n", "12345678");
printf(">>%3s<<\n", "12345678");
printf(">>%.3s<<\n", "12345678");
printf(">>%10.3s<<\n", "12345678");

Do you understand now why I "dislike" printf? How many people know
these formatting rules well?

>
> * BUG?: ereport(ERROR,
>           (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>            errmsg("unsupported tag \"%%%c\"", tag)));
>   Is the code ok if the tag (a char) is a partial byte of multi-byte character?

It's a bug - the supported tags are all single-byte, but an unsupported
tag can be a multibyte character and must be shown correctly - fixed.

>   My machine prints ? in the case, but it might be platform-dependent.
>
> === Both patches ===
> * Performance: I don't think those functions are not performance-critical,
>   but we could cache typoutput functions in fn_extra if needed.
>   record_out would be a reference.

I thought about it too and have checked it now - there is a 0.4%
performance difference on 10000000 rows on my PC (format function) - so
I didn't make any changes. Caching the OIDs means a few more lines, but
no real effect is expected here.

>
> * Coding: Whitespace and tabs are mixed in some places. They are not so
>   important because we will run pgindent, but careful choice will be
>   preferred even of a patch.
>

checked, fixed

Thank you very much for the review

regards

Pavel Stehule

> --
> Itagaki Takahiro
>
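The width/precision rules from the printf lines above can be checked with a few lines of plain C. This is only a sketch and not part of the patch; the helper name show_str is made up for illustration. It relies on two facts from the C standard: a width passed through "*" is only a minimum, and a negative precision passed through ".*" is treated as if no precision was given.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format s between ">>" and "<<" using the libc "%*.*s" form.
 * width is a minimum field width (0 = none); precision is the maximum
 * number of bytes taken from s (negative = no limit).
 * show_str is a hypothetical helper, it does not exist in the patch.
 */
const char *
show_str(char *out, size_t outlen, int width, int precision, const char *s)
{
    assert(out != NULL && outlen > 0);
    snprintf(out, outlen, ">>%*.*s<<", width, precision, s);
    return out;
}
```

Calling show_str(buf, sizeof(buf), 3, -1, "12345678") leaves the string untouched, because width never shortens a value; show_str(buf, sizeof(buf), 10, 3, "12345678") keeps only "123" and pads it on the left to ten columns.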
*** ./doc/src/sgml/func.sgml.orig 2010-09-09 02:48:22.000000000 +0200
--- ./doc/src/sgml/func.sgml 2010-09-29 07:18:59.845395002 +0200
***************
*** 1272,1277 ****
--- 1272,1280 ----
       <primary>encode</primary>
      </indexterm>
      <indexterm>
+      <primary>format</primary>
+     </indexterm>
+     <indexterm>
       <primary>initcap</primary>
      </indexterm>
      <indexterm>
***************
*** 1495,1500 ****
--- 1498,1520 ----
         <entry><literal>encode(E'123\\000\\001', 'base64')</literal></entry>
         <entry><literal>MTIzAAE=</literal></entry>
        </row>
+ 
+       <row>
+        <entry>
+         <literal><function>format</function>(<parameter>formatstr</parameter> <type>text</type>
+         [, <parameter>str</parameter> <type>"any"</type> [, ...] ])</literal>
+        </entry>
+        <entry><type>text</type></entry>
+        <entry>
+         This function can be used to create a formatted string or message. Three types of
+         tags are allowed: %s as a string, %i as an SQL identifier and %l as an SQL literal. Note:
+         the result for %i and %l need not be the same as the result of <function>quote_ident</function> and
+         <function>quote_literal</function>, because this function doesn't coerce its
+         parameters to <type>text</type> type; it uses each type's output function directly.
+        </entry>
+        <entry><literal>format('Hello %s', 'World')</literal></entry>
+        <entry><literal>Hello World</literal></entry>
+       </row>

        <row>
         <entry><literal><function>initcap(<parameter>string</parameter>)</function></literal></entry>
*** ./src/backend/utils/adt/varlena.c.orig 2010-09-29 07:16:25.744395786 +0200
--- ./src/backend/utils/adt/varlena.c 2010-09-29 09:51:39.340270718 +0200
***************
*** 21,28 ****
--- 21,30 ----
  #include "libpq/md5.h"
  #include "libpq/pqformat.h"
  #include "miscadmin.h"
+ #include "parser/parse_coerce.h"
  #include "parser/scansup.h"
  #include "regex/regex.h"
+ #include "utils/array.h"
  #include "utils/builtins.h"
  #include "utils/bytea.h"
  #include "utils/lsyscache.h"
***************
*** 48,53 ****
--- 50,61 ----
      int skiptable[256]; /* skip distance for given mismatched char */
  } TextPositionState;

+ typedef struct
+ {
+     int n_valid_oids;
+     Oid typoutput[FUNC_MAX_ARGS];
+ } format_fn_extra_cache;
+ 
  #define DatumGetUnknownP(X) ((unknown *) PG_DETOAST_DATUM(X))
  #define DatumGetUnknownPCopy(X) ((unknown *) PG_DETOAST_DATUM_COPY(X))
  #define PG_GETARG_UNKNOWN_P(n) DatumGetUnknownP(PG_GETARG_DATUM(n))
***************
*** 3702,3704 ****
--- 3710,3866 ----

      PG_RETURN_TEXT_P(result);
  }
+ 
+ /*
+  * Text format - a variadic function replaces %c symbols with entered text.
+  */
+ Datum
+ text_format(PG_FUNCTION_ARGS)
+ {
+     text *fmt;
+     StringInfoData str;
+     char *cp;
+     int i = 1;
+     size_t len;
+     char *start_ptr;
+     char *end_ptr;
+     text *result;
+ 
+     /* When format string is null, returns null */
+     if (PG_ARGISNULL(0))
+         PG_RETURN_NULL();
+ 
+     fmt = PG_GETARG_TEXT_PP(0);
+     len = VARSIZE_ANY_EXHDR(fmt);
+     start_ptr = VARDATA_ANY(fmt);
+     end_ptr = start_ptr + len - 1;
+ 
+     initStringInfo(&str);
+     for (cp = start_ptr; cp <= end_ptr; cp++)
+     {
+         /*
+          * the only allowed escape char is '\'
+          */
+         if (cp[0] == '\\')
+         {
+             /* check next char */
+             if (cp < end_ptr)
+             {
+                 switch (cp[1])
+                 {
+                     case '\\':
+                     case '%':
+                         appendStringInfoChar(&str, cp[1]);
+                         break;
+ 
+                     default:
+                         ereport(ERROR,
+                             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                              errmsg("unsupported escape sequence \\%.*s", pg_mblen(&cp[1]), &cp[1])));
+                 }
+                 cp++;
+             }
+             else
+                 ereport(ERROR,
+                     (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                      errmsg("broken escape sequence")));
+         }
+         else if (cp[0] == '%')
+         {
+             char tag;
+ 
+             /* initial check */
+             if (cp == end_ptr)
+                 ereport(ERROR,
+                     (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                      errmsg("missing formatting tag")));
+ 
+             if (i >= PG_NARGS())
+                 ereport(ERROR,
+                     (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                      errmsg("too few parameters for format function")));
+ 
+             tag = cp[1];
+             cp++;
+ 
+             if (!PG_ARGISNULL(i))
+             {
+                 Oid valtype;
+                 Datum value;
+                 Oid typoutput;
+                 bool typIsVarlena;
+ 
+                 /* append n-th value */
+                 value = PG_GETARG_DATUM(i);
+                 valtype = get_fn_expr_argtype(fcinfo->flinfo, i);
+                 getTypeOutputInfo(valtype, &typoutput, &typIsVarlena);
+ 
+                 if (tag == 's')
+                 {
+                     /* show it as unspecified string */
+                     appendStringInfoString(&str, OidOutputFunctionCall(typoutput, value));
+                 }
+                 else if (tag == 'i')
+                 {
+                     char *target_value;
+ 
+                     /* show it as sql identifier */
+                     target_value = OidOutputFunctionCall(typoutput, value);
+                     appendStringInfoString(&str, quote_identifier(target_value));
+                 }
+                 else if (tag == 'l')
+                 {
+                     text *txt;
+                     text *quoted_txt;
+ 
+                     /* get text value and quote it */
+                     txt = cstring_to_text(OidOutputFunctionCall(typoutput, value));
+                     quoted_txt = DatumGetTextP(DirectFunctionCall1(quote_literal,
+                                                     PointerGetDatum(txt)));
+                     appendStringInfoString(&str, text_to_cstring(quoted_txt));
+                 }
+                 else
+                     ereport(ERROR,
+                         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                          errmsg("unsupported tag \"%%%.*s\"", pg_mblen(cp), cp)));
+             }
+             else
+             {
+                 if (tag == 'i')
+                     ereport(ERROR,
+                         (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+                          errmsg("NULL is used as SQL identifier")));
+                 else if (tag == 'l')
+                     appendStringInfoString(&str, "NULL");
+                 else if (tag != 's')
+                     ereport(ERROR,
+                         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                          errmsg("unsupported tag \"%%%.*s\"", pg_mblen(cp), cp)));
+             }
+             i++;
+         }
+         else
+             appendStringInfoChar(&str, cp[0]);
+     }
+ 
+     /* check if all arguments are used */
+     if (i != PG_NARGS())
+         ereport(ERROR,
+             (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+              errmsg("too many parameters for format function")));
+ 
+     result = cstring_to_text_with_len(str.data, str.len);
+     pfree(str.data);
+ 
+     PG_RETURN_TEXT_P(result);
+ }
+ 
+ /*
+  * Non variadic text_format function - only wrapper
+  * Print and check format string
+  */
+ Datum
+ text_format_nv(PG_FUNCTION_ARGS)
+ {
+     return text_format(fcinfo);
+ }
*** ./src/include/catalog/pg_proc.h.orig 2010-09-03 03:34:55.000000000 +0200
--- ./src/include/catalog/pg_proc.h 2010-09-29 07:16:42.658395449 +0200
***************
*** 2741,2746 ****
--- 2741,2750 ----
  DESCR("return the last n characters");
  DATA(insert OID = 3062 ( reverse PGNSP PGUID 12 1 0 0 f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ text_reverse _null_ _null_ _null_ ));
  DESCR("reverse text");
+ DATA(insert OID = 3063 ( format PGNSP PGUID 12 1 0 2276 f f f f f s 2 0 25 "25 2276" "{25,2276}" "{i,v}" _null_ _null_ text_format _null_ _null_ _null_ ));
+ DESCR("format text message");
+ DATA(insert OID = 3064 ( format PGNSP PGUID 12 1 0 0 f f f f f s 1 0 25 "25" _null_ _null_ _null_ _null_ text_format_nv _null_ _null_ _null_ ));
+ DESCR("format text message");

  DATA(insert OID = 1810 ( bit_length PGNSP PGUID 14 1 0 0 f f f t f i 1 0 23 "17" _null_ _null_ _null_ _null_ "select pg_catalog.octet_length($1) * 8" _null_ _null_ _null_ ));
  DESCR("length in bits");
*** ./src/include/utils/builtins.h.orig 2010-09-03 03:34:55.000000000 +0200
--- ./src/include/utils/builtins.h 2010-09-29 07:16:42.661395833 +0200
***************
*** 742,747 ****
--- 742,749 ----
  extern Datum text_left(PG_FUNCTION_ARGS);
  extern Datum text_right(PG_FUNCTION_ARGS);
  extern Datum text_reverse(PG_FUNCTION_ARGS);
+ extern Datum text_format(PG_FUNCTION_ARGS);
+ extern Datum text_format_nv(PG_FUNCTION_ARGS);

  /* version.c */
  extern Datum pgsql_version(PG_FUNCTION_ARGS);
*** ./src/test/regress/expected/text.out.orig 2010-09-29 07:16:25.749395800 +0200
--- ./src/test/regress/expected/text.out 2010-09-29 07:16:42.662395333 +0200
***************
*** 118,120 ****
--- 118,139 ----
     5 | ahoj | ahoj
  (11 rows)

+ select format('some text');
+   format
+ -----------
+  some text
+ (1 row)
+ 
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', 'My tab', 'Hello', NULL, 10);
+                           format
+ -----------------------------------------------------------
+  -- insert into "My tab" (a,b,c) values('Hello',NULL,'10')
+ (1 row)
+ 
+ -- should fail
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', NULL, 'Hello', NULL, 10);
+ ERROR:  NULL is used as SQL identifier
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', 'My tab', NULL, 'Hello');
+ ERROR:  too few parameters for format function
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', 'My tab', 'Hello', NULL, 10, 10);
+ ERROR:  too many parameters for format function
*** ./src/test/regress/sql/text.sql.orig 2010-09-29 07:16:25.751395218 +0200
--- ./src/test/regress/sql/text.sql 2010-09-29 07:16:42.662395333 +0200
***************
*** 41,43 ****
--- 41,49 ----
  select concat_ws(NULL,10,20,null,30) is null;
  select reverse('abcde');
  select i, left('ahoj', i), right('ahoj', i) from generate_series(-5, 5) t(i) order by i;
+ select format('some text');
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', 'My tab', 'Hello', NULL, 10);
+ -- should fail
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', NULL, 'Hello', NULL, 10);
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', 'My tab', NULL, 'Hello');
+ select format('-- insert into %i (a,b,c) values(%l,%l,%l)', 'My tab', 'Hello', NULL, 10, 10);
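For readers who want to try the tag rules of format() outside the backend, here is a small standalone sketch in plain C. It is not the patch code, and it makes loud simplifications: the names OutBuf, put_char, put_quoted and mini_format are invented for this example, all arguments are already C strings (a NULL pointer stands for SQL NULL), %i always adds double quotes (the real quote_identifier quotes only when needed), and %l only doubles single quotes (the real quote_literal also deals with backslashes).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bounded output buffer; "failed" is set once something did not fit. */
typedef struct
{
    char   *data;
    size_t  cap;
    size_t  len;
    int     failed;
} OutBuf;

static void
put_char(OutBuf *o, char c)
{
    if (o->len + 1 < o->cap)        /* always keep room for the final NUL */
        o->data[o->len++] = c;
    else
        o->failed = 1;
}

/* Append s wrapped in quote character q, doubling every embedded q. */
static void
put_quoted(OutBuf *o, const char *s, char q)
{
    put_char(o, q);
    for (; *s; s++)
    {
        if (*s == q)
            put_char(o, q);
        put_char(o, *s);
    }
    put_char(o, q);
}

/*
 * Hypothetical miniature of format(): returns 0 on success and -1 on a bad
 * tag, bad escape, NULL identifier, wrong argument count or overflow.
 * On failure the content of out is unspecified.
 */
int
mini_format(char *out, size_t cap, const char *fmt,
            const char *const *args, int nargs)
{
    OutBuf      o = {out, cap, 0, 0};
    int         i = 0;
    const char *cp;

    assert(out != NULL && cap > 0);

    for (cp = fmt; *cp; cp++)
    {
        if (*cp == '\\')
        {
            /* only \\ and \% are accepted escapes */
            if (cp[1] != '\\' && cp[1] != '%')
                return -1;
            put_char(&o, *++cp);
        }
        else if (*cp == '%')
        {
            char        tag = *++cp;
            const char *arg;

            if (tag == '\0' || i >= nargs)
                return -1;          /* missing tag or too few arguments */
            arg = args[i++];

            if (tag == 's')
            {
                /* plain text; a NULL argument prints nothing */
                for (; arg && *arg; arg++)
                    put_char(&o, *arg);
            }
            else if (tag == 'i')
            {
                if (arg == NULL)
                    return -1;      /* NULL cannot be an identifier */
                put_quoted(&o, arg, '"');
            }
            else if (tag == 'l')
            {
                if (arg == NULL)
                    for (arg = "NULL"; *arg; arg++)
                        put_char(&o, *arg);
                else
                    put_quoted(&o, arg, '\'');
            }
            else
                return -1;          /* unsupported tag */
        }
        else
            put_char(&o, *cp);
    }

    o.data[o.len] = '\0';
    /* every argument must have been consumed, and nothing truncated */
    return (i == nargs && !o.failed) ? 0 : -1;
}
```

With the arguments "My tab", "Hello", NULL and "10", the format string from the regression test above produces the same text as the expected output, and the three "should fail" cases return -1 here instead of raising an error.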
*** ./contrib/Makefile.orig 2010-06-14 18:17:56.000000000 +0200 --- ./contrib/Makefile 2010-09-29 08:14:37.569273826 +0200 *************** *** 40,45 **** --- 40,46 ---- pgstattuple \ seg \ spi \ + stringfunc \ tablefunc \ test_parser \ tsearch2 \ *** ./contrib/stringfunc/expected/stringfunc.out.orig 2010-09-29 08:11:40.945271948 +0200 --- ./contrib/stringfunc/expected/stringfunc.out 2010-09-29 08:10:48.363270909 +0200 *************** *** 0 **** --- 1,101 ---- + SET client_min_messages = warning; + \set ECHO none + RESET client_min_messages; + -- sprintf test + select sprintf('>>>%10s %10d<<<', 'hello', 10); + sprintf + ----------------------------- + >>> hello 10<<< + (1 row) + + select sprintf('>>>%-10s<<<', 'hello'); + sprintf + ------------------ + >>>hello <<< + (1 row) + + select sprintf('>>>%5.2<<<', 'abcde'); + ERROR: unsupported sprintf format tag '<' + select sprintf('>>>%*s<<<', 10, 'abcdef'); + sprintf + ------------------ + >>> abcdef<<< + (1 row) + + select sprintf('>>>%*s<<<', 10); -- error + ERROR: too few parameters specified for printf function + select sprintf('%010d', 10); + sprintf + ------------ + 0000000010 + (1 row) + + select sprintf('%.6d', 10); + sprintf + --------- + 000010 + (1 row) + + select sprintf('%d', 100.0/3.0); + sprintf + --------- + 33 + (1 row) + + select sprintf('%e', 100.0/3.0); + sprintf + -------------- + 3.333333e+01 + (1 row) + + select sprintf('%f', 100.0/3.0); + sprintf + ----------- + 33.333333 + (1 row) + + select sprintf('%g', 100.0/3.0); + sprintf + --------- + 33.3333 + (1 row) + + select sprintf('%7.4e', 100.0/3.0); + sprintf + ------------ + 3.3333e+01 + (1 row) + + select sprintf('%7.4f', 100.0/3.0); + sprintf + --------- + 33.3333 + (1 row) + + select sprintf('%7.4g', 100.0/3.0); + sprintf + --------- + 33.33 + (1 row) + + select sprintf('%d', NULL); + sprintf + --------- + <NULL> + (1 row) + + select substitute('second parameter is $2 and first parameter is $1', 'FIRST', 'SECOND'); + substitute + 
--------------------------------------------------------- + second parameter is SECOND and first parameter is FIRST + (1 row) + + -- should fail + select substitute('third parameter is $3 and first parameter is $1', 'FIRST', 'SECOND'); + ERROR: positional placeholder "$3" is not valid + select substitute(' NULL parameter is $1', NULL); + substitute + --------------------------- + NULL parameter is <NULL> + (1 row) + *** ./contrib/stringfunc/Makefile.orig 2010-09-29 08:11:59.217397393 +0200 --- ./contrib/stringfunc/Makefile 2010-09-29 08:10:48.364270479 +0200 *************** *** 0 **** --- 1,20 ---- + # $PostgreSQL: pgsql/contrib/stringfunc/Makefile,v 1.23 2009/08/28 20:26:18 petere Exp $ + + MODULE_big = stringfunc + OBJS= stringfunc.o + + DATA_built = stringfunc.sql + DATA = uninstall_stringfunc.sql + REGRESS = stringfunc + + ifdef USE_PGXS + PG_CONFIG = pg_config + PGXS := $(shell $(PG_CONFIG) --pgxs) + include $(PGXS) + else + subdir = contrib/stringfunc + top_builddir = ../.. + include $(top_builddir)/src/Makefile.global + include $(top_srcdir)/contrib/contrib-global.mk + endif + *** ./contrib/stringfunc/sql/stringfunc.sql.orig 2010-09-29 08:11:52.441397426 +0200 --- ./contrib/stringfunc/sql/stringfunc.sql 2010-09-29 08:10:48.364270479 +0200 *************** *** 0 **** --- 1,28 ---- + SET client_min_messages = warning; + \set ECHO none + \i stringfunc.sql + \set ECHO all + RESET client_min_messages; + + -- sprintf test + select sprintf('>>>%10s %10d<<<', 'hello', 10); + select sprintf('>>>%-10s<<<', 'hello'); + select sprintf('>>>%5.2<<<', 'abcde'); + select sprintf('>>>%*s<<<', 10, 'abcdef'); + select sprintf('>>>%*s<<<', 10); -- error + select sprintf('%010d', 10); + select sprintf('%.6d', 10); + + select sprintf('%d', 100.0/3.0); + select sprintf('%e', 100.0/3.0); + select sprintf('%f', 100.0/3.0); + select sprintf('%g', 100.0/3.0); + select sprintf('%7.4e', 100.0/3.0); + select sprintf('%7.4f', 100.0/3.0); + select sprintf('%7.4g', 100.0/3.0); + select 
sprintf('%d', NULL); + + select substitute('second parameter is $2 and first parameter is $1', 'FIRST', 'SECOND'); + -- should fail + select substitute('third parameter is $3 and first parameter is $1', 'FIRST', 'SECOND'); + select substitute(' NULL parameter is $1', NULL); \ No newline at end of file *** ./contrib/stringfunc/stringfunc.c.orig 2010-09-29 08:12:07.209397369 +0200 --- ./contrib/stringfunc/stringfunc.c 2010-09-29 09:42:33.306272084 +0200 *************** *** 0 **** --- 1,666 ---- + #include "postgres.h" + #include "string.h" + + #include "catalog/pg_type.h" + #include "lib/stringinfo.h" + #include "mb/pg_wchar.h" + #include "parser/parse_coerce.h" + #include "utils/array.h" + #include "utils/builtins.h" + #include "utils/lsyscache.h" + + PG_MODULE_MAGIC; + + #define CHECK_PAD(symbol, pad_value) \ + do { \ + if (pdesc->flags & pad_value) \ + ereport(ERROR, \ + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), \ + errmsg("broken sprintf format"), \ + errdetail("Format string is '%s'.", TextDatumGetCString(fmt)), \ + errhint("Symbol '%c' can be used only one time.", symbol))); \ + pdesc->flags |= pad_value; \ + } while(0); + + /* + * string functions + */ + Datum stringfunc_sprintf(PG_FUNCTION_ARGS); + Datum stringfunc_sprintf_nv(PG_FUNCTION_ARGS); + Datum stringfunc_substitute(PG_FUNCTION_ARGS); + Datum stringfunc_substitute_nv(PG_FUNCTION_ARGS); + + /* + * V1 registrations + */ + PG_FUNCTION_INFO_V1(stringfunc_sprintf); + PG_FUNCTION_INFO_V1(stringfunc_sprintf_nv); + PG_FUNCTION_INFO_V1(stringfunc_substitute); + PG_FUNCTION_INFO_V1(stringfunc_substitute_nv); + + typedef enum { + stringfunc_ZERO = 1, + stringfunc_SPACE = 2, + stringfunc_PLUS = 4, + stringfunc_MINUS = 8, + stringfunc_STAR_WIDTH = 16, + stringfunc_SHARP = 32, + stringfunc_WIDTH = 64, + stringfunc_PRECISION = 128, + stringfunc_STAR_PRECISION = 256 + } PlaceholderTags; + + typedef struct { + int flags; + char field_type; + char lenmod; + int32 width; + int32 precision; + } FormatPlaceholderData; + 
+ typedef FormatPlaceholderData *PlaceholderDesc; + + static Datum + castValueTo(Datum value, Oid targetTypeId, Oid inputTypeId) + { + Oid funcId; + CoercionPathType pathtype; + FmgrInfo finfo; + Datum result; + + if (inputTypeId != UNKNOWNOID) + pathtype = find_coercion_pathway(targetTypeId, inputTypeId, + COERCION_EXPLICIT, + &funcId); + else + pathtype = COERCION_PATH_COERCEVIAIO; + + switch (pathtype) + { + case COERCION_PATH_RELABELTYPE: + result = value; + break; + case COERCION_PATH_FUNC: + { + Assert(OidIsValid(funcId)); + + fmgr_info(funcId, &finfo); + result = FunctionCall1(&finfo, value); + } + break; + + case COERCION_PATH_COERCEVIAIO: + { + Oid typoutput; + Oid typinput; + bool typIsVarlena; + Oid typIOParam; + char *extval; + + getTypeOutputInfo(inputTypeId, &typoutput, &typIsVarlena); + extval = OidOutputFunctionCall(typoutput, value); + + getTypeInputInfo(targetTypeId, &typinput, &typIOParam); + result = OidInputFunctionCall(typinput, extval, typIOParam, -1); + } + break; + + default: + elog(ERROR, "failed to find conversion function from %s to %s", + format_type_be(inputTypeId), format_type_be(targetTypeId)); + /* be compiler quiet */ + result = (Datum) 0; + } + + return result; + } + + /* + * parse and verify sprintf parameter + * + * %[flags][width][.precision]specifier + * + */ + static char * + parsePlaceholder(char *src, char *end_ptr, PlaceholderDesc pdesc, text *fmt) + { + char c; + + pdesc->field_type = '\0'; + pdesc->lenmod = '\0'; + pdesc->flags = 0; + pdesc->width = 0; + pdesc->precision = 0; + + while (src < end_ptr && pdesc->field_type == '\0') + { + c = *++src; + + switch (c) + { + case '0': + CHECK_PAD('0', stringfunc_ZERO); + break; + case ' ': + CHECK_PAD(' ', stringfunc_SPACE); + break; + case '+': + CHECK_PAD('+', stringfunc_PLUS); + break; + case '-': + CHECK_PAD('-', stringfunc_MINUS); + break; + case '*': + CHECK_PAD('*', stringfunc_STAR_WIDTH); + break; + case '#': + CHECK_PAD('#', stringfunc_SHARP); + break; + case 'o': case 
'i': case 'e': case 'E': case 'f': + case 'g': case 'd': case 's': case 'x': case 'X': + pdesc->field_type = *src; + break; + case '1': case '2': case '3': case '4': + case '5': case '6': case '7': case '8': case '9': + CHECK_PAD('9', stringfunc_WIDTH); + pdesc->width = c - '0'; + while (src < end_ptr && isdigit(src[1])) + pdesc->width = pdesc->width * 10 + *++src - '0'; + break; + case '.': + if (src < end_ptr) + { + if (src[1] == '*') + { + CHECK_PAD('.', stringfunc_STAR_PRECISION); + src++; + } + else + { + /* + * when no one digit is entered, then precision + * is zero - digits are optional. + */ + CHECK_PAD('.', stringfunc_PRECISION); + while (src < end_ptr && isdigit(src[1])) + { + pdesc->precision = pdesc->precision * 10 + *++src - '0'; + } + } + } + else + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("broken sprintf format"), + errdetail("missing precision value"))); + break; + + default: + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("unsupported sprintf format tag '%.*s'", pg_mblen(src), src))); + } + } + + if (pdesc->field_type == '\0') + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("broken sprintf format"))); + + return src; + } + + static char * + currentFormat(StringInfo str, PlaceholderDesc pdesc) + { + resetStringInfo(str); + appendStringInfoChar(str,'%'); + + if (pdesc->flags & stringfunc_ZERO) + appendStringInfoChar(str, '0'); + + if (pdesc->flags & stringfunc_MINUS) + appendStringInfoChar(str, '-'); + + if (pdesc->flags & stringfunc_PLUS) + appendStringInfoChar(str, '+'); + + if (pdesc->flags & stringfunc_SPACE) + appendStringInfoChar(str, ' '); + + if (pdesc->flags & stringfunc_SHARP) + appendStringInfoChar(str, '#'); + + if ((pdesc->flags & stringfunc_WIDTH) || (pdesc->flags & stringfunc_STAR_WIDTH)) + appendStringInfoChar(str, '*'); + + if ((pdesc->flags & stringfunc_PRECISION) || (pdesc->flags & stringfunc_STAR_PRECISION)) + appendStringInfoString(str, ".*"); + + /* 
Append l or ll. Decision is based on value of INT64_FORMAT */ + if (pdesc->lenmod == 'l') + { + if (strcmp(INT64_FORMAT, "%lld") == 0) + appendStringInfoString(str, "ll"); + else + appendStringInfoString(str, "l"); + } + else if (pdesc->lenmod != '\0') + appendStringInfoChar(str, pdesc->lenmod); + + appendStringInfoChar(str, pdesc->field_type); + + return str->data; + } + + /* + * simulate %+width.precion%s format of sprintf function + */ + static void + append_string(StringInfo str, PlaceholderDesc pdesc, char *string) + { + int nchars = 0; /* length of substring in chars */ + int binlen = 0; /* length of substring in bytes */ + + /* + * apply precision - it means "show only first n chars", for strings - this flag is + * ignored for proprietary tags %lq and iq, because we can't to show a first n chars + * from possible quoted value. + */ + if (pdesc->flags & stringfunc_PRECISION && pdesc->field_type != 'q') + { + char *ptr = string; + int len = pdesc->precision; + + if (pg_database_encoding_max_length() > 1) + { + while (*ptr && len > 0) + { + ptr += pg_mblen(ptr); + len--; + nchars++; + } + } + else + { + while (*ptr && len > 0) + { + ptr++; + len--; + nchars++; + } + } + + binlen = ptr - string; + } + else + { + /* there isn't precion specified, show complete string */ + nchars = pg_mbstrlen(string); + binlen = strlen(string); + } + + /* when width is specified, then we have to solve left or right align */ + if (pdesc->flags & stringfunc_WIDTH) + { + if (pdesc->width > nchars) + { + /* add neccessary spaces to begin or end */ + if (pdesc->flags & stringfunc_MINUS) + { + /* allign to left */ + appendBinaryStringInfo(str, string, binlen); + appendStringInfoSpaces(str, pdesc->width - nchars); + } + else + { + /* allign to right */ + appendStringInfoSpaces(str, pdesc->width - nchars); + appendBinaryStringInfo(str, string, binlen); + } + + } + else + /* just copy result to output */ + appendBinaryStringInfo(str, string, binlen); + } + else + /* just copy result to 
output */ + appendBinaryStringInfo(str, string, binlen); + } + + /* + * Set width and precision when they are defined dynamicaly + */ + static + int setWidthAndPrecision(PlaceholderDesc pdesc, FunctionCallInfoData *fcinfo, int current) + { + + /* + * don't allow ambiguous definition + */ + if ((pdesc->flags & stringfunc_WIDTH) && (pdesc->flags & stringfunc_STAR_WIDTH)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("broken sprintf format"), + errdetail("ambiguous width definition"))); + + if ((pdesc->flags & stringfunc_PRECISION) && (pdesc->flags & stringfunc_STAR_PRECISION)) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("broken sprintf format"), + errdetail("ambiguous precision definition"))); + if (pdesc->flags & stringfunc_STAR_WIDTH) + { + if (current >= PG_NARGS()) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("too few parameters specified for printf function"))); + + if (PG_ARGISNULL(current)) + ereport(ERROR, + (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), + errmsg("null value not allowed"), + errhint("width (%dth) arguments is NULL", current))); + + pdesc->width = DatumGetInt32(castValueTo(PG_GETARG_DATUM(current), INT4OID, + get_fn_expr_argtype(fcinfo->flinfo, current))); + /* reset flag */ + pdesc->flags ^= stringfunc_STAR_WIDTH; + pdesc->flags |= stringfunc_WIDTH; + current += 1; + } + + if (pdesc->flags & stringfunc_STAR_PRECISION) + { + if (current >= PG_NARGS()) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("too few parameters specified for printf function"))); + + if (PG_ARGISNULL(current)) + ereport(ERROR, + (errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED), + errmsg("null value not allowed"), + errhint("width (%dth) arguments is NULL", current))); + + pdesc->precision = DatumGetInt32(castValueTo(PG_GETARG_DATUM(current), INT4OID, + get_fn_expr_argtype(fcinfo->flinfo, current))); + /* reset flags */ + pdesc->flags ^= stringfunc_STAR_PRECISION; + pdesc->flags |= 
stringfunc_PRECISION; + current += 1; + } + + return current; + } + + /* + * sprintf function - it is wrapper for libc vprintf function + * + * ensure PostgreSQL -> C casting + */ + Datum + stringfunc_sprintf(PG_FUNCTION_ARGS) + { + text *fmt; + StringInfoData str; + StringInfoData format_str; + char *cp; + int i = 1; + size_t len; + char *start_ptr, + *end_ptr; + FormatPlaceholderData pdesc; + text *result; + + Oid typoutput; + bool typIsVarlena; + Datum value; + Oid valtype; + + /* When format string is null, returns null */ + if (PG_ARGISNULL(0)) + PG_RETURN_NULL(); + + fmt = PG_GETARG_TEXT_PP(0); + len = VARSIZE_ANY_EXHDR(fmt); + start_ptr = VARDATA_ANY(fmt); + end_ptr = start_ptr + len - 1; + + initStringInfo(&str); + initStringInfo(&format_str); + + for (cp = start_ptr; cp <= end_ptr; cp++) + { + if (cp[0] == '%') + { + /* when cp is not pointer on last char, check %% */ + if (cp < end_ptr && cp[1] == '%') + { + appendStringInfoChar(&str, cp[1]); + cp++; + continue; + } + + cp = parsePlaceholder(cp, end_ptr, &pdesc, fmt); + i = setWidthAndPrecision(&pdesc, fcinfo, i); + + if (i >= PG_NARGS()) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("too few parameters specified for printf function"))); + + if (!PG_ARGISNULL(i)) + { + /* append n-th value */ + value = PG_GETARG_DATUM(i); + valtype = get_fn_expr_argtype(fcinfo->flinfo, i); + + /* convert value to target type */ + switch (pdesc.field_type) + { + case 'o': case 'd': case 'i': case 'x': case 'X': + { + int64 target_value; + const char *format; + + pdesc.lenmod = 'l'; + target_value = DatumGetInt64(castValueTo(value, INT8OID, valtype)); + format = currentFormat(&format_str, &pdesc); + + if ((pdesc.flags & stringfunc_WIDTH) && (pdesc.flags & stringfunc_PRECISION)) + appendStringInfo(&str, format, pdesc.width, pdesc.precision, target_value); + else if (pdesc.flags & stringfunc_WIDTH) + appendStringInfo(&str, format, pdesc.width, target_value); + else if (pdesc.flags & 
stringfunc_PRECISION) + appendStringInfo(&str, format, pdesc.precision, target_value); + else + appendStringInfo(&str, format, target_value); + } + break; + case 'e': case 'f': case 'g': case 'G': case 'E': + { + float8 target_value; + const char *format; + + target_value = DatumGetFloat8(castValueTo(value, FLOAT8OID, valtype)); + format = currentFormat(&format_str, &pdesc); + + if ((pdesc.flags & stringfunc_WIDTH) && (pdesc.flags & stringfunc_PRECISION)) + appendStringInfo(&str, format, pdesc.width, pdesc.precision, target_value); + else if (pdesc.flags & stringfunc_WIDTH) + appendStringInfo(&str, format, pdesc.width, target_value); + else if (pdesc.flags & stringfunc_PRECISION) + appendStringInfo(&str, format, pdesc.precision, target_value); + else + appendStringInfo(&str, format, target_value); + } + break; + case 's': + { + char *target_value; + + getTypeOutputInfo(valtype, &typoutput, &typIsVarlena); + target_value = OidOutputFunctionCall(typoutput, value); + + append_string(&str, &pdesc, target_value); + pfree(target_value); + } + break; + default: + /* don't be happen - formats are checked in parsing stage */ + elog(ERROR, "unknown format: %c", pdesc.field_type); + } + } + else + /* append a NULL string */ + append_string(&str, &pdesc, "<NULL>"); + i++; + } + else + appendStringInfoChar(&str, cp[0]); + } + + /* check if all arguments are used */ + if (i != PG_NARGS()) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("too many parameters for printf function"))); + result = cstring_to_text_with_len(str.data, str.len); + + pfree(str.data); + pfree(format_str.data); + + PG_RETURN_TEXT_P(result); + } + + /* + * only wrapper + */ + Datum + stringfunc_sprintf_nv(PG_FUNCTION_ARGS) + { + return stringfunc_sprintf(fcinfo); + } + + /* + * Substitute a positional parameters by value + */ + Datum + stringfunc_substitute(PG_FUNCTION_ARGS) + { + text *fmt; + StringInfoData str; + char *cp; + size_t len; + char *start_ptr; + char *end_ptr; + text 
*result; + ArrayType *array; + Oid elmtype; + int16 elmlen; + bool elmbyval; + char elmalign; + int num_elems = 0; + Datum *elem_values; + bool *elem_nulls; + + fmt = PG_GETARG_TEXT_PP(0); + len = VARSIZE_ANY_EXHDR(fmt); + start_ptr = VARDATA_ANY(fmt); + end_ptr = start_ptr + len - 1; + + if (PG_NARGS() == 2) + { + array = PG_GETARG_ARRAYTYPE_P(1); + elmtype = ARR_ELEMTYPE(array); + get_typlenbyvalalign(elmtype, &elmlen, &elmbyval, &elmalign); + + deconstruct_array(array, elmtype, + elmlen, elmbyval, elmalign, + &elem_values, &elem_nulls, + &num_elems); + } + + initStringInfo(&str); + for (cp = start_ptr; cp <= end_ptr; cp++) + { + /* + * there are allowed escape char - '\' + */ + if (cp[0] == '\\') + { + /* check next char */ + if (cp < end_ptr) + { + switch (cp[1]) + { + case '\\': + case '$': + appendStringInfoChar(&str, cp[1]); + break; + + default: + /* + * because unsupported symbols should be a multibyte chars, + * we cannot to use a %c formating. We have take a complete + * multibyte char length and show it via %.*s. 
+ 						 */
+ 						ereport(ERROR,
+ 								(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 								 errmsg("unsuported escape sequence \"\\%.*s\"", pg_mblen(&cp[1]), &cp[1])));
+ 				}
+ 				cp++;
+ 			}
+ 			else
+ 				elog(ERROR, "broken escape sequence");
+ 		}
+ 		else if (cp[0] == '$')
+ 		{
+ 			long		pos;
+ 			char	   *endptr;
+
+ 			/* initial check */
+ 			if (cp == end_ptr)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 						 errmsg("missing a parameter position specification")));
+
+ 			if (!isdigit((unsigned char) cp[1]))
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 						 errmsg("expected a numeric value")));
+
+ 			pos = strtol(&cp[1], &endptr, 10);
+ 			cp = endptr - 1;
+
+ 			if (pos < 1 || pos > num_elems)
+ 				ereport(ERROR,
+ 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 						 errmsg("positional placeholder \"$%ld\" is not valid", pos)));
+
+ 			if (!elem_nulls[pos - 1])
+ 				appendStringInfoString(&str, text_to_cstring(DatumGetTextP(elem_values[pos - 1])));
+ 			else
+ 				appendStringInfoString(&str, "<NULL>");
+ 		}
+ 		else
+ 			appendStringInfoChar(&str, cp[0]);
+ 	}
+
+ 	result = cstring_to_text_with_len(str.data, str.len);
+ 	pfree(str.data);
+
+ 	PG_RETURN_TEXT_P(result);
+ }
+
+ /*
+  * Non-variadic substitute function - only a wrapper
+  * Print and check format string
+  */
+ Datum
+ stringfunc_substitute_nv(PG_FUNCTION_ARGS)
+ {
+ 	return stringfunc_substitute(fcinfo);
+ }
*** ./contrib/stringfunc/stringfunc.sql.in.orig	2010-09-29 08:12:51.329397362 +0200
--- ./contrib/stringfunc/stringfunc.sql.in	2010-09-29 08:10:48.366270806 +0200
***************
*** 0 ****
--- 1,25 ----
+ /* $PostgreSQL: pgsql/contrib/stringfunc/stringfunc.sql.in,v 1.25 2009/06/11 18:30:03 tgl Exp $ */
+
+ -- Adjust this setting to control where the objects get created.
+ SET search_path = public;
+
+ CREATE OR REPLACE FUNCTION sprintf(fmt text, VARIADIC args "any")
+ RETURNS text
+ AS '$libdir/stringfunc','stringfunc_sprintf'
+ LANGUAGE C STABLE;
+
+ CREATE OR REPLACE FUNCTION sprintf(fmt text)
+ RETURNS text
+ AS '$libdir/stringfunc','stringfunc_sprintf_nv'
+ LANGUAGE C STABLE;
+
+ CREATE OR REPLACE FUNCTION substitute(fmt text, VARIADIC args text[])
+ RETURNS text
+ AS '$libdir/stringfunc','stringfunc_substitute'
+ LANGUAGE C STABLE;
+
+ CREATE OR REPLACE FUNCTION substitute(fmt text)
+ RETURNS text
+ AS '$libdir/stringfunc','stringfunc_substitute_nv'
+ LANGUAGE C STABLE;
+
*** ./contrib/stringfunc/uninstall_stringfunc.sql.orig	2010-09-29 08:12:56.353273675 +0200
--- ./contrib/stringfunc/uninstall_stringfunc.sql	2010-09-29 08:10:48.366270806 +0200
***************
*** 0 ****
--- 1,10 ----
+ /* $PostgreSQL: pgsql/contrib/stringfunc/uninstall_stringfunc.sql,v 1.8 2008/04/14 17:05:32 tgl Exp $ */
+
+ -- Adjust this setting to control where the objects get dropped.
+ SET search_path = public;
+
+ DROP FUNCTION sprintf(fmt text, VARIADIC args "any");
+ DROP FUNCTION sprintf(fmt text);
+ DROP FUNCTION substitute(fmt text, VARIADIC args text[]);
+ DROP FUNCTION substitute(fmt text);
+
*** ./doc/src/sgml/contrib.sgml.orig	2010-09-29 08:10:11.529398663 +0200
--- ./doc/src/sgml/contrib.sgml	2010-09-29 08:10:48.367270236 +0200
***************
*** 115,120 ****
--- 115,121 ----
   &seg;
   &contrib-spi;
   &sslinfo;
+  &stringfunc;
   &tablefunc;
   &test-parser;
   &tsearch2;
*** ./doc/src/sgml/filelist.sgml.orig	2010-09-29 08:10:11.530395997 +0200
--- ./doc/src/sgml/filelist.sgml	2010-09-29 08:10:48.367270236 +0200
***************
*** 127,132 ****
--- 127,133 ----
  <!entity seg             SYSTEM "seg.sgml">
  <!entity contrib-spi     SYSTEM "contrib-spi.sgml">
  <!entity sslinfo         SYSTEM "sslinfo.sgml">
+ <!entity stringfunc      SYSTEM "stringfunc.sgml">
  <!entity tablefunc       SYSTEM "tablefunc.sgml">
  <!entity test-parser     SYSTEM "test-parser.sgml">
  <!entity tsearch2        SYSTEM "tsearch2.sgml">
*** ./doc/src/sgml/stringfunc.sgml.orig	2010-09-29 08:42:33.899272275 +0200
--- ./doc/src/sgml/stringfunc.sgml	2010-09-29 09:04:00.493270789 +0200
***************
*** 0 ****
--- 1,67 ----
+ <!-- $PostgreSQL: pgsql/doc/src/sgml/stringfunc.sgml,v 1.2 2008/09/12 18:29:49 tgl Exp $ -->
+
+ <sect1 id="stringfunc">
+  <title>stringfunc</title>
+
+  <indexterm zone="stringfunc">
+   <primary>stringfunc</primary>
+  </indexterm>
+
+  <para>
+   The <filename>stringfunc</> module provides additional functions
+   for operations on strings. These functions can be used as a pattern
+   for developing variadic functions.
+  </para>
+
+  <sect2>
+   <title>How to Use It</title>
+
+   <para>
+    Here's a simple example of usage:
+
+    <programlisting>
+ SELECT sprintf('formatted number: %10d',10);
+ SELECT substitute('file ''$1'' doesn''t exist', '/var/log/applog');
+    </programlisting>
+   </para>
+  </sect2>
+
+  <sect2>
+   <title><filename>stringfunc</> Functions and Operators</title>
+
+   <table id="stringfunc-func-table">
+    <title><filename>stringfunc</> Functions</title>
+
+    <tgroup cols="5">
+     <thead>
+      <row>
+       <entry>Function</entry>
+       <entry>Return Type</entry>
+       <entry>Description</entry>
+       <entry>Example</entry>
+       <entry>Result</entry>
+      </row>
+     </thead>
+
+     <tbody>
+      <row>
+       <entry><function>sprintf(formatstr [, params])</function></entry>
+       <entry><type>text</type></entry>
+       <entry>simplified version of the libc sprintf function - it doesn't support
+        positional parameters, and it does the necessary conversions automatically.</entry>
+       <entry><literal>sprintf('Hello %10s','World')</literal></entry>
+       <entry><literal>Hello World</literal></entry>
+      </row>
+
+      <row>
+       <entry><function>substitute(formatstr [, params])</function></entry>
+       <entry><type>text</type></entry>
+       <entry>replaces positional placeholders like $n with params</entry>
+       <entry><literal>substitute('$1,$2,$2,$1','A','B')</literal></entry>
+       <entry><literal>A,B,B,A</literal></entry>
+      </row>
+     </tbody>
+    </tgroup>
+   </table>
+  </sect2>
+ </sect1>
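
For anyone who wants to try the placeholder rules outside the backend, here is a minimal standalone sketch of the $n loop in stringfunc_substitute(). It is not part of the patch. The name substitute_sketch, the fixed-size output buffer and the -1 error return are assumptions made for illustration; they stand in for StringInfo and ereport(), and the multibyte handling in the error path is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical standalone sketch (not from the patch): expand $n
 * placeholders in fmt using args[0..nargs-1]. "\\" and "\$" are the only
 * escapes. A NULL argument is rendered as "<NULL>", as in the patch.
 * Returns 0 on success, -1 on a bad escape, a bad position, or overflow.
 */
static int
substitute_sketch(const char *fmt, const char **args, int nargs,
                  char *out, size_t outsize)
{
    size_t      used = 0;
    const char *cp;

    assert(outsize > 0);

    for (cp = fmt; *cp != '\0'; cp++)
    {
        if (*cp == '\\')
        {
            /* only a backslash or a dollar sign may follow the escape char */
            if (cp[1] != '\\' && cp[1] != '$')
                return -1;
            cp++;
            if (used + 1 >= outsize)
                return -1;
            out[used++] = *cp;
        }
        else if (*cp == '$')
        {
            char       *endptr;
            long        pos;
            const char *val;
            size_t      vlen;

            /* cast avoids undefined behavior for negative char values */
            if (!isdigit((unsigned char) cp[1]))
                return -1;

            pos = strtol(cp + 1, &endptr, 10);
            if (pos < 1 || pos > nargs)
                return -1;

            val = args[pos - 1] ? args[pos - 1] : "<NULL>";
            vlen = strlen(val);
            if (used + vlen >= outsize)
                return -1;
            memcpy(out + used, val, vlen);
            used += vlen;

            /* the loop increment moves past the last digit */
            cp = endptr - 1;
        }
        else
        {
            if (used + 1 >= outsize)
                return -1;
            out[used++] = *cp;
        }
    }

    out[used] = '\0';
    return 0;
}
```

With args = {"A", "B"}, the format '$1,$2,$2,$1' gives 'A,B,B,A', which matches the example in the documentation table, and a position outside 1..nargs is rejected the same way the patch rejects it with "positional placeholder ... is not valid".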