On 17 February 2016 at 00:39, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
> On 2/16/16, Vitaly Burovoy <vitaly.buro...@gmail.com> wrote:
>> On 2/16/16, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>>> Fixing that in parse_memory_unit() would be messy because it assumes a
>>> base unit of kB, so it would require a negative multiplier, and
>>> pg_size_bytes() would have to be taught to divide by the magnitude of
>>> negative multipliers in the same way that guc.c does.
>
> Now parse_memory_unit returns -1024 for bytes as divider, constant
> "bytes" has moved there.
> Add new memory_units_bytes_hint which differs from an original
> memory_units_int by "bytes" size unit.
> Allow hintmsg be NULL and if so, skip setting dereferenced variable to
> memory_units_bytes_hint.
>

I think that approach is getting more and more unwieldy, and it simply
isn't worth the effort just to share a few values from the unit
conversion table, especially given that the set of supported units
differs between the two places.

>>> ISTM that it would be far less code, and much simpler and more
>>> readable to just parse the supported units directly in
>>> pg_size_bytes(), rather than trying to share code with guc.c, when the
>>> supported units are actually different and may well diverge further in
>>> the future.
>>

I've gone with this approach and it is indeed far less code, and much
simpler and easier to read. This will also make it easier to
maintain/extend in the future.

I've made a few minor tweaks to the docs, and added a note to make it
clear that the units in these functions work in powers of 2 not 10.

I also took the opportunity to tidy up the number scanning code
somewhat (I was tempted to rip it out entirely, since it feels awfully
close to duplicating the numeric code, but it's probably worth it for
the better error message).

Additionally, note that I replaced strcasecmp() with pg_strcasecmp(),
since AIUI the former is not available on all supported platforms.

Barring objections, and subject to some more testing, I intend to
commit this version.

Regards,
Dean
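
For readers who want the gist of the unit handling without reading the whole patch, here is a minimal standalone sketch of the idea described above: the supported units are matched directly and case-insensitively, and each one is a power of 2. It is an illustration only, not the code in the patch. The names unit_equals() and size_unit_multiplier() are hypothetical, plain C library calls stand in for pg_strcasecmp(), and returning 0 for an unknown unit is an assumption of the sketch (the patch raises an error instead).

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: case-insensitive string equality.  It plays the
 * role that "pg_strcasecmp(a, b) == 0" plays in the patch.
 */
int
unit_equals(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same position */
    return *a == '\0' && *b == '\0';
}

/*
 * Hypothetical helper: map a unit name to its multiplier in bytes.
 * Returns 0 for an unrecognised unit (an assumption of this sketch).
 */
int64_t
size_unit_multiplier(const char *unit)
{
    /* Each unit is 2^shift bytes: kB = 2^10, MB = 2^20, and so on */
    static const struct
    {
        const char *name;
        int         shift;
    } units[] = {
        {"bytes", 0}, {"kb", 10}, {"mb", 20}, {"gb", 30}, {"tb", 40}
    };
    size_t      i;

    for (i = 0; i < sizeof(units) / sizeof(units[0]); i++)
    {
        if (unit_equals(unit, units[i].name))
            return (int64_t) 1 << units[i].shift;
    }
    return 0;
}
```

The multipliers are written as shifts of an int64_t value so that the sketch does not depend on the width of long on any particular platform. With these definitions, size_unit_multiplier("kB") gives 1024 and size_unit_multiplier("byte") gives 0, mirroring the regression test in which the singular "byte" is rejected.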

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index f9eea76..60f117a
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -17767,6 +17767,9 @@ postgres=# SELECT * FROM pg_xlogfile_nam
     <primary>pg_relation_size</primary>
    </indexterm>
    <indexterm>
+    <primary>pg_size_bytes</primary>
+   </indexterm>
+   <indexterm>
     <primary>pg_size_pretty</primary>
    </indexterm>
    <indexterm>
@@ -17838,6 +17841,15 @@ postgres=# SELECT * FROM pg_xlogfile_nam
       </row>
       <row>
        <entry>
+        <literal><function>pg_size_bytes(<type>text</type>)</function></literal>
+        </entry>
+       <entry><type>bigint</type></entry>
+       <entry>
+         Converts a size in human-readable format with size units into bytes
+       </entry>
+      </row>
+      <row>
+       <entry>
         <literal><function>pg_size_pretty(<type>bigint</type>)</function></literal>
         </entry>
        <entry><type>text</type></entry>
@@ -17968,11 +17980,27 @@ postgres=# SELECT * FROM pg_xlogfile_nam
 
    <para>
     <function>pg_size_pretty</> can be used to format the result of one of
-    the other functions in a human-readable way, using kB, MB, GB or TB as
-    appropriate.
+    the other functions in a human-readable way, using bytes, kB, MB, GB or TB
+    as appropriate.
    </para>
 
    <para>
+    <function>pg_size_bytes</> can be used to get the size in bytes from a
+    string in human-readable format. The input may have units of bytes, kB,
+    MB, GB or TB, and is parsed case-insensitively. If no units are specified,
+    bytes are assumed.
+   </para>
+
+   <note>
+    <para>
+     The units kB, MB, GB and TB used by the functions
+     <function>pg_size_pretty</> and <function>pg_size_bytes</> are defined
+     using powers of 2 rather than powers of 10, so 1kB is 1024 bytes, 1MB is
+     1024<superscript>2</> = 1048576 bytes, and so on.
+    </para>
+   </note>
+
+   <para>
     The functions above that operate on tables or indexes accept a
     <type>regclass</> argument, which is simply the OID of the table or index
     in the <structname>pg_class</> system catalog.  You do not have to look up
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 2084692..91260cd
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -700,6 +700,145 @@ pg_size_pretty_numeric(PG_FUNCTION_ARGS)
 }
 
 /*
+ * Convert a human-readable size to a size in bytes
+ */
+Datum
+pg_size_bytes(PG_FUNCTION_ARGS)
+{
+    text       *arg = PG_GETARG_TEXT_PP(0);
+    char       *str,
+               *strptr,
+               *endptr;
+    bool        have_digits;
+    char        saved_char;
+    Numeric     num;
+    int64       result;
+
+    str = text_to_cstring(arg);
+
+    /* Skip leading whitespace */
+    strptr = str;
+    while (isspace((unsigned char) *strptr))
+        strptr++;
+
+    /* Check that we have a valid number and determine where it ends */
+    endptr = strptr;
+
+    /* Part (1): sign */
+    if (*endptr == '-' || *endptr == '+')
+        endptr++;
+
+    /* Part (2): main digit string */
+    if (isdigit((unsigned char) *endptr))
+    {
+        have_digits = true;
+        do
+            endptr++;
+        while (isdigit((unsigned char) *endptr));
+    }
+    else
+        have_digits = false;
+
+    /* Part (3): optional decimal point and fractional digits */
+    if (*endptr == '.')
+    {
+        endptr++;
+        if (isdigit((unsigned char) *endptr))
+        {
+            have_digits = true;
+            do
+                endptr++;
+            while (isdigit((unsigned char) *endptr));
+        }
+    }
+
+    /* Complain if we don't have a valid number at this point */
+    if (!have_digits)
+        ereport(ERROR,
+                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                 errmsg("invalid size: \"%s\"", str)));
+
+    /* Part (4): optional exponent */
+    if (*endptr == 'e' || *endptr == 'E')
+    {
+        char       *cp;
+
+        (void) strtol(endptr + 1, &cp, 10);
+        if (cp > endptr + 1)
+            endptr = cp;
+    }
+
+    /*
+     * Parse the number, saving the next character, which may be the first
+     * character of the unit string.
+     */
+    saved_char = *endptr;
+    *endptr = '\0';
+
+    num = DatumGetNumeric(DirectFunctionCall3(numeric_in,
+                                              CStringGetDatum(strptr),
+                                              ObjectIdGetDatum(InvalidOid),
+                                              Int32GetDatum(-1)));
+
+    *endptr = saved_char;
+
+    /* Skip whitespace between number and unit */
+    strptr = endptr;
+    while (isspace((unsigned char) *strptr))
+        strptr++;
+
+    /* Handle possible unit */
+    if (*strptr != '\0')
+    {
+        int64       multiplier = 0;
+
+        /* Trim any trailing whitespace */
+        endptr = str + VARSIZE_ANY_EXHDR(arg) - 1;
+
+        while (isspace((unsigned char) *endptr))
+            endptr--;
+
+        endptr++;
+        *endptr = '\0';
+
+        /* Parse the unit case-insensitively */
+        if (pg_strcasecmp(strptr, "bytes") == 0)
+            multiplier = 1;
+        else if (pg_strcasecmp(strptr, "kb") == 0)
+            multiplier = 1024;
+        else if (pg_strcasecmp(strptr, "mb") == 0)
+            multiplier = 1024 * 1024;
+        else if (pg_strcasecmp(strptr, "gb") == 0)
+            multiplier = 1024 * 1024 * 1024;
+        else if (pg_strcasecmp(strptr, "tb") == 0)
+            multiplier = 1024 * 1024 * 1024 * 1024L;
+        else
+            ereport(ERROR,
+                    (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                     errmsg("invalid size: \"%s\"", text_to_cstring(arg)),
+                     errdetail("Invalid size unit: \"%s\".", strptr),
+                     errhint("Valid units are \"bytes\", \"kB\", \"MB\", \"GB\", and \"TB\".")));
+
+        if (multiplier > 1)
+        {
+            Numeric     mul_num;
+
+            mul_num = DatumGetNumeric(DirectFunctionCall1(int8_numeric,
+                                                          Int64GetDatum(multiplier)));
+
+            num = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+                                                      NumericGetDatum(mul_num),
+                                                      NumericGetDatum(num)));
+        }
+    }
+
+    result = DatumGetInt64(DirectFunctionCall1(numeric_int8,
+                                               NumericGetDatum(num)));
+
+    PG_RETURN_INT64(result);
+}
+
+/*
  * Get the filenode of a relation
  *
  * This is expected to be used in queries like
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 1c0ef9a..f5580b1
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3601,6 +3601,8 @@ DATA(insert OID = 2288 ( pg_size_pretty
 DESCR("convert a long int to a human readable text using size units");
 DATA(insert OID = 3166 ( pg_size_pretty PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 25 "1700" _null_ _null_ _null_ _null_ _null_ pg_size_pretty_numeric _null_ _null_ _null_ ));
 DESCR("convert a numeric to a human readable text using size units");
+DATA(insert OID = 3334 ( pg_size_bytes PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 20 "25" _null_ _null_ _null_ _null_ _null_ pg_size_bytes _null_ _null_ _null_ ));
+DESCR("convert a size in human-readable format with size units into bytes");
 DATA(insert OID = 2997 ( pg_table_size PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 20 "2205" _null_ _null_ _null_ _null_ _null_ pg_table_size _null_ _null_ _null_ ));
 DESCR("disk space usage for the specified table, including TOAST, free space and visibility map");
 DATA(insert OID = 2998 ( pg_indexes_size PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 20 "2205" _null_ _null_ _null_ _null_ _null_ pg_indexes_size _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
new file mode 100644
index affcc01..feaf2bd
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -473,6 +473,7 @@ extern Datum pg_relation_size(PG_FUNCTIO
 extern Datum pg_total_relation_size(PG_FUNCTION_ARGS);
 extern Datum pg_size_pretty(PG_FUNCTION_ARGS);
 extern Datum pg_size_pretty_numeric(PG_FUNCTION_ARGS);
+extern Datum pg_size_bytes(PG_FUNCTION_ARGS);
 extern Datum pg_table_size(PG_FUNCTION_ARGS);
 extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
 extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/dbsize.out b/src/test/regress/expected/dbsize.out
new file mode 100644
index aa513e7..ef5b41a
--- a/src/test/regress/expected/dbsize.out
+++ b/src/test/regress/expected/dbsize.out
@@ -35,3 +35,108 @@ SELECT size, pg_size_pretty(size), pg_si
  1000000000000000.5 | 909 TB         | -909 TB
 (12 rows)
 
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('1'), ('123bytes'), ('1kB'), ('1MB'), (' 1 GB'), ('1.5 GB '),
+            ('1TB'), ('3000 TB'), ('1e6 MB')) x(size);
+  pg_size_bytes   
+------------------
+                1
+              123
+             1024
+          1048576
+       1073741824
+       1610612736
+    1099511627776
+ 3298534883328000
+    1048576000000
+(9 rows)
+
+-- case-insensitive units are supported
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('1'), ('123bYteS'), ('1kb'), ('1mb'), (' 1 Gb'), ('1.5 gB '),
+            ('1tb'), ('3000 tb'), ('1e6 mb')) x(size);
+  pg_size_bytes   
+------------------
+                1
+              123
+             1024
+          1048576
+       1073741824
+       1610612736
+    1099511627776
+ 3298534883328000
+    1048576000000
+(9 rows)
+
+-- negative numbers are supported
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('-1'), ('-123bytes'), ('-1kb'), ('-1mb'), (' -1 Gb'), ('-1.5 gB '),
+            ('-1tb'), ('-3000 TB'), ('-10e-1 MB')) x(size);
+   pg_size_bytes   
+-------------------
+                -1
+              -123
+             -1024
+          -1048576
+       -1073741824
+       -1610612736
+    -1099511627776
+ -3298534883328000
+          -1048576
+(9 rows)
+
+-- different cases with allowed points
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('-1.'), ('-1.kb'), ('-1. kb'), ('-0. gb'),
+            ('-.1'), ('-.1kb'), ('-.1 kb'), ('-.0 gb')) x(size);
+ pg_size_bytes 
+---------------
+            -1
+         -1024
+         -1024
+             0
+             0
+          -102
+          -102
+             0
+(8 rows)
+
+-- invalid inputs
+SELECT pg_size_bytes('1 AB');
+ERROR:  invalid size: "1 AB"
+DETAIL:  Invalid size unit: "AB".
+HINT:  Valid units are "bytes", "kB", "MB", "GB", and "TB".
+SELECT pg_size_bytes('1 AB A');
+ERROR:  invalid size: "1 AB A"
+DETAIL:  Invalid size unit: "AB A".
+HINT:  Valid units are "bytes", "kB", "MB", "GB", and "TB".
+SELECT pg_size_bytes('1 AB A ');
+ERROR:  invalid size: "1 AB A "
+DETAIL:  Invalid size unit: "AB A".
+HINT:  Valid units are "bytes", "kB", "MB", "GB", and "TB".
+SELECT pg_size_bytes('9223372036854775807.9');
+ERROR:  bigint out of range
+SELECT pg_size_bytes('1 byte'); -- the singular "byte" is not supported
+ERROR:  invalid size: "1 byte"
+DETAIL:  Invalid size unit: "byte".
+HINT:  Valid units are "bytes", "kB", "MB", "GB", and "TB".
+SELECT pg_size_bytes('');
+ERROR:  invalid size: ""
+SELECT pg_size_bytes('kb');
+ERROR:  invalid size: "kb"
+SELECT pg_size_bytes('..');
+ERROR:  invalid size: ".."
+SELECT pg_size_bytes('-.');
+ERROR:  invalid size: "-."
+SELECT pg_size_bytes('-.kb');
+ERROR:  invalid size: "-.kb"
+SELECT pg_size_bytes('-. kb');
+ERROR:  invalid size: "-. kb"
+SELECT pg_size_bytes('.+912');
+ERROR:  invalid size: ".+912"
+SELECT pg_size_bytes('+912+ kB');
+ERROR:  invalid size: "+912+ kB"
+DETAIL:  Invalid size unit: "+ kB".
+HINT:  Valid units are "bytes", "kB", "MB", "GB", and "TB".
+SELECT pg_size_bytes('++123 kB');
+ERROR:  invalid size: "++123 kB"
diff --git a/src/test/regress/sql/dbsize.sql b/src/test/regress/sql/dbsize.sql
new file mode 100644
index c118090..cd999ed
--- a/src/test/regress/sql/dbsize.sql
+++ b/src/test/regress/sql/dbsize.sql
@@ -10,3 +10,40 @@ SELECT size, pg_size_pretty(size), pg_si
             (10.5::numeric), (1000.5::numeric), (1000000.5::numeric),
             (1000000000.5::numeric), (1000000000000.5::numeric),
             (1000000000000000.5::numeric)) x(size);
+
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('1'), ('123bytes'), ('1kB'), ('1MB'), (' 1 GB'), ('1.5 GB '),
+            ('1TB'), ('3000 TB'), ('1e6 MB')) x(size);
+
+-- case-insensitive units are supported
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('1'), ('123bYteS'), ('1kb'), ('1mb'), (' 1 Gb'), ('1.5 gB '),
+            ('1tb'), ('3000 tb'), ('1e6 mb')) x(size);
+
+-- negative numbers are supported
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('-1'), ('-123bytes'), ('-1kb'), ('-1mb'), (' -1 Gb'), ('-1.5 gB '),
+            ('-1tb'), ('-3000 TB'), ('-10e-1 MB')) x(size);
+
+-- different cases with allowed points
+SELECT pg_size_bytes(size) FROM
+    (VALUES ('-1.'), ('-1.kb'), ('-1. kb'), ('-0. gb'),
+            ('-.1'), ('-.1kb'), ('-.1 kb'), ('-.0 gb')) x(size);
+
+-- invalid inputs
+SELECT pg_size_bytes('1 AB');
+SELECT pg_size_bytes('1 AB A');
+SELECT pg_size_bytes('1 AB A ');
+SELECT pg_size_bytes('9223372036854775807.9');
+SELECT pg_size_bytes('1 byte'); -- the singular "byte" is not supported
+SELECT pg_size_bytes('');
+
+SELECT pg_size_bytes('kb');
+SELECT pg_size_bytes('..');
+SELECT pg_size_bytes('-.');
+SELECT pg_size_bytes('-.kb');
+SELECT pg_size_bytes('-. kb');
+
+SELECT pg_size_bytes('.+912');
+SELECT pg_size_bytes('+912+ kB');
+SELECT pg_size_bytes('++123 kB');
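
As a reading aid for the "invalid inputs" cases in the regression tests, the number scanning in the patch (sign, digits, optional fraction, optional exponent) can be approximated outside the backend by the following standalone sketch. It is an approximation under stated assumptions, not the patch itself: scan_size_number() is a hypothetical name, it returns NULL where the patch would raise "invalid size", and it only finds where the number ends rather than converting it with numeric_in().

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a pointer just past the numeric prefix of
 * "s", or NULL if no digits were found.  The steps follow parts (1) to (4)
 * of the scanning code in the patch.
 */
const char *
scan_size_number(const char *s)
{
    const char *p = s;
    int         have_digits = 0;

    /* Skip leading whitespace */
    while (isspace((unsigned char) *p))
        p++;

    /* Part (1): optional sign */
    if (*p == '-' || *p == '+')
        p++;

    /* Part (2): main digit string */
    while (isdigit((unsigned char) *p))
    {
        have_digits = 1;
        p++;
    }

    /* Part (3): optional decimal point and fractional digits */
    if (*p == '.')
    {
        p++;
        while (isdigit((unsigned char) *p))
        {
            have_digits = 1;
            p++;
        }
    }

    /* Without at least one digit this is not a number */
    if (!have_digits)
        return NULL;

    /* Part (4): optional exponent, accepted only if strtol() consumes input */
    if (*p == 'e' || *p == 'E')
    {
        char       *cp;

        (void) strtol(p + 1, &cp, 10);
        if (cp > p + 1)
            p = cp;
    }

    return p;
}
```

Under these assumptions, inputs such as "-.", "kb", ".+912" and "++123 kB" yield NULL, which corresponds to the tests that fail with a bare "invalid size" error, while "+912+ kB" scans as the number "+912" and leaves "+ kB" to be rejected later as an invalid unit.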