This is an automated email from the ASF dual-hosted git repository. smiklosovic pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git
The following commit(s) were added to refs/heads/trunk by this push: new 12c061b3bb Updated documentation with additional function examples 12c061b3bb is described below commit 12c061b3bb61d9c7c3699ca9a160e08760dd60d4 Author: Arra, Praveen R <praveen.r.a...@jpmchase.com> AuthorDate: Mon Mar 17 11:32:25 2025 -0500 Updated documentation with additional function examples patch by Arra Praveen; reviewed by Stefan Miklosovic for CASSANDRA-20254 --- .../cassandra/examples/BNF/insert_statement.bnf | 3 +- .../cassandra/examples/CQL/update_statement.cql | 6 +- .../pages/developing/cql/definitions.adoc | 1 + .../cassandra/pages/developing/cql/dml.adoc | 61 +++++------- .../cassandra/pages/developing/cql/functions.adoc | 106 ++++++++++++--------- .../cassandra/pages/developing/cql/types.adoc | 2 + .../cassandra/partials/masking_functions.adoc | 28 +++--- 7 files changed, 110 insertions(+), 97 deletions(-) diff --git a/doc/modules/cassandra/examples/BNF/insert_statement.bnf b/doc/modules/cassandra/examples/BNF/insert_statement.bnf index ed80c3ed05..514af382ef 100644 --- a/doc/modules/cassandra/examples/BNF/insert_statement.bnf +++ b/doc/modules/cassandra/examples/BNF/insert_statement.bnf @@ -1,6 +1,7 @@ insert_statement::= INSERT INTO table_name ( names_values | json_clause ) [ IF NOT EXISTS ] - [ USING update_parameter ( AND update_parameter )* ] + [ USING insert_parameter ( AND insert_parameter )* ] names_values::= names VALUES tuple_literal json_clause::= JSON string [ DEFAULT ( NULL | UNSET ) ] names::= '(' column_name ( ',' column_name )* ')' +insert_parameter ::= ( TIMESTAMP | TTL ) ( integer | bind_marker ) diff --git a/doc/modules/cassandra/examples/CQL/update_statement.cql b/doc/modules/cassandra/examples/CQL/update_statement.cql index 7e1cfa76fe..c3b4edca0e 100644 --- a/doc/modules/cassandra/examples/CQL/update_statement.cql +++ b/doc/modules/cassandra/examples/CQL/update_statement.cql @@ -1,10 +1,14 @@ +UPDATE NerdMovies +SET director = 'Joss Whedon' +WHERE movie = 
'Serenity'; + UPDATE NerdMovies USING TTL 400 SET director = 'Joss Whedon', main_actor = 'Nathan Fillion', year = 2005 WHERE movie = 'Serenity'; -UPDATE UserActions +UPDATE UserActions USING TIMESTAMP 1735689600 SET total = total + 2 WHERE user = B70DE1D0-9908-4AE3-BE34-5573E5B09F14 AND action = 'click'; diff --git a/doc/modules/cassandra/pages/developing/cql/definitions.adoc b/doc/modules/cassandra/pages/developing/cql/definitions.adoc index 9d494c85b4..e139c28ef9 100644 --- a/doc/modules/cassandra/pages/developing/cql/definitions.adoc +++ b/doc/modules/cassandra/pages/developing/cql/definitions.adoc @@ -172,6 +172,7 @@ this documentation (see links above): include::cassandra:example$BNF/cql_statement.bnf[] ---- +[[prepared-statements]] == Prepared Statements CQL supports _prepared statements_. Prepared statements are an diff --git a/doc/modules/cassandra/pages/developing/cql/dml.adoc b/doc/modules/cassandra/pages/developing/cql/dml.adoc index 674ede8145..66981129cd 100644 --- a/doc/modules/cassandra/pages/developing/cql/dml.adoc +++ b/doc/modules/cassandra/pages/developing/cql/dml.adoc @@ -72,25 +72,11 @@ include::cassandra:example$CQL/as.cql[] [NOTE] ==== Currently, aliases aren't recognized in the `WHERE` or `ORDER BY` clauses in the statement. -You must use the orignal column name instead. +You must use the original column name instead. ==== -[[writetime-and-ttl-function]] -==== `WRITETIME`, `MAXWRITETIME` and `TTL` function - -Selection supports three special functions that aren't allowed anywhere -else: `WRITETIME`, `MAXWRITETIME` and `TTL`. -All functions take only one argument, a column name. If the column is a collection or UDT, it's possible to add element -selectors, such as `WRITETTIME(phones[2..4])` or `WRITETTIME(user.name)`. -These functions retrieve meta-information that is stored internally for each column: - -* `WRITETIME` stores the timestamp of the value of the column. -* `MAXWRITETIME` stores the largest timestamp of the value of the column. 
For non-collection and non-UDT columns, `MAXWRITETIME` -is equivalent to `WRITETIME`. In the other cases, it returns the largest timestamp of the values in the column. -* `TTL` stores the remaining time to live (in seconds) for the value of the column if it is set to expire; otherwise the value is `null`. - -The `WRITETIME` and `TTL` functions can be used on multi-cell columns such as non-frozen collections or non-frozen -user-defined types. In that case, the functions will return the list of timestamps or TTLs for each selected cell. +Selection supports four special functions: `WRITETIME`, `MINWRITETIME`, `MAXWRITETIME` and `TTL`. +See the xref:cassandra:developing/cql/functions.adoc#writetime-and-ttl-functions[functions] section for details. [[where-clause]] === The `WHERE` clause @@ -120,7 +106,7 @@ include::cassandra:example$CQL/where.cql[] ---- But the following one is not, as it does not select a contiguous set of -rows (and we suppose no secondary indexes are set): +rows (assuming no secondary indexes): [source,cql] ---- @@ -133,7 +119,7 @@ Rows will be selected based on the token of the `PARTITION_KEY` rather than on t ==== The token of a key depends on the partitioner in use, and that in particular the `RandomPartitioner` won't yield a meaningful order. -Also note that ordering partitioners always order token values by bytes (so +Also note that the `ByteOrderedPartitioner` always orders token values by bytes (so even if the partition key is of type int, `token(-1) > token(0)` in particular). ==== @@ -216,7 +202,7 @@ or the reverse The `LIMIT` option to a `SELECT` statement limits the number of rows returned by a query. The `PER PARTITION LIMIT` option limits the -number of rows returned for a given partition by the query. Both types of limits can used in the same statement. +number of rows returned for a given partition by the query. Both types of limits can be used in the same statement. 
[[allow-filtering]] === Allowing filtering @@ -246,7 +232,7 @@ The first query returns all rows, because all users are selected. The second query returns only the rows defined by the secondary index, a per-node implementation; the results will depend on the number of nodes in the cluster, and is indirectly proportional to the amount of data stored. The number of nodes will always be multiple number of magnitude lower than the number of user profiles stored. -Both queries may return very large result sets, but the addition of a `LIMIT` clause can reduced the latency. +Both queries may return very large result sets, but the addition of a `LIMIT` clause would reduce the latency. The following query will be rejected: @@ -283,10 +269,10 @@ include::cassandra:example$CQL/insert_statement.cql[] The `INSERT` statement writes one or more columns for a given row in a table. -Since a row is identified by its `PRIMARY KEY`, at least one columns must be specified. +Since a row is identified by its `PRIMARY KEY`, at least one column must be specified. The list of columns to insert must be supplied with the `VALUES` syntax. When using the `JSON` syntax, `VALUES` are optional. -See the section on xref:cassandra:developing/cql/dml.adoc#cql-json[JSON support] for more detail. +See the section on xref:cassandra:developing/cql/json.adoc[JSON support] for more detail. All updates for an `INSERT` are applied atomically and in isolation. Unlike in SQL, `INSERT` does not check the prior existence of the row by default. @@ -297,7 +283,9 @@ The `IF NOT EXISTS` condition can restrict the insertion if the row does not exi However, note that using `IF NOT EXISTS` will incur a non-negligible performance cost, because Paxos is used, so this should be used sparingly. -Please refer to the xref:cassandra:developing/cql/dml.adoc#update-parameters[UPDATE] section for informations on the `update_parameter`. 
+When using xref:cassandra:developing/cql/definitions.adoc#prepared-statements[Prepared Statements], a `bind_marker` can be used instead of an actual value. + +Please refer to the xref:cassandra:developing/cql/dml.adoc#upsert-parameters[INSERT PARAMETERS] section for information on the `insert_parameter`. Also note that `INSERT` does not support counters, while `UPDATE` does. [[update-statement]] == UPDATE Updating a row is done using an `UPDATE` statement: [source,bnf] ---- include::cassandra:example$BNF/update_statement.bnf[] ---- -For instance: - -[source,cql] ----- -include::cassandra:example$CQL/update_statement.cql[] ----- - The `UPDATE` statement writes one or more columns for a given row in a table. The `WHERE` clause is used to select the row to update and must include all columns of the `PRIMARY KEY`. @@ -333,15 +314,15 @@ However, like the `IF NOT EXISTS` condition, a non-negligible performance cost c Regarding the `SET` assignment: * `c = c + 3` will increment/decrement counters, the only operation allowed. -The column name after the '=' sign *must* be the same than the one before the '=' sign. +The column name after the '=' sign *must* be the same as the one before the '=' sign. Increment/decrement is only allowed on counters. -See the section on xref:cassandra:developing/cql/dml.adoc#counters[counters] for details. +See the section on xref:cassandra:developing/cql/counter-column.adoc[counters] for details. * `id = id + <some-collection>` and `id[value1] = value2` are for collections. See the xref:cassandra:developing/cql/types.adoc#collections[collections] for details. * `id.field = 3` is for setting the value of a field on a non-frozen user-defined types. See the xref:cassandra:developing/cql/types.adoc#udts[UDTs] for details. 
-=== Update parameters +=== [[upsert-parameters]]Insert and Update parameters `UPDATE` and `INSERT` statements support the following parameters: @@ -362,6 +343,14 @@ the coordinator will use the current time (in microseconds) at the start of statement execution as the timestamp. This is usually a suitable default. +For instance: + +[source,cql] +---- +include::cassandra:example$CQL/update_statement.cql[] +---- + + [[delete_statement]] == DELETE @@ -389,7 +378,7 @@ may be deleted with one statement by using an `IN` operator. A range of rows may be deleted using an inequality operator (such as `>=`). `DELETE` supports the `TIMESTAMP` option with the same semantics as in -xref:cassandra:developing/cql/dml.adoc#update-parameters[updates]. +xref:cassandra:developing/cql/dml.adoc#upsert-parameters[updates]. In a `DELETE` statement, all deletions within the same partition key are applied atomically and in isolation. @@ -455,7 +444,7 @@ only isolated within a single partition). There is a performance penalty for batch atomicity when a batch spans multiple partitions. If you do not want to incur this penalty, you can -tell Cassandra to skip the batchlog with the `UNLOGGED` option. If the +tell Cassandra to skip the batch log with the `UNLOGGED` option. If the `UNLOGGED` option is used, a failed batch might leave the batch only partly applied. diff --git a/doc/modules/cassandra/pages/developing/cql/functions.adoc b/doc/modules/cassandra/pages/developing/cql/functions.adoc index ec8cd08527..43b95257ee 100644 --- a/doc/modules/cassandra/pages/developing/cql/functions.adoc +++ b/doc/modules/cassandra/pages/developing/cql/functions.adoc @@ -1,14 +1,12 @@ -// Need some intro for UDF and native functions in general and point those to it. 
-// [[cql-functions]][[native-functions]] +[[cql-functions]] = Functions CQL supports 2 main categories of functions: -* xref:cassandra:developing/cql/functions.adoc#scalar-functions[scalar functions] that take a number of values and produce an output +* xref:cassandra:developing/cql/functions.adoc#scalar-native-functions[scalar functions] that take a number of values and produce an output * xref:cassandra:developing/cql/functions.adoc#aggregate-functions[aggregate functions] that aggregate multiple rows resulting from a `SELECT` statement -In both cases, CQL provides a number of native "hard-coded" functions as -well as the ability to create new user-defined functions. +In both cases, CQL provides a number of native "hard-coded" functions and also allows for the creation of custom user-defined functions. [NOTE] ==== @@ -92,9 +90,9 @@ into its own datatype. The conversions rely strictly on Java's semantics. For example, the double value 1 will be converted to the text value '1.0'. For instance: -[source,cql] +[source,sql] ---- -SELECT avg(cast(count as double)) FROM myTable +SELECT avg(cast(count as double)) FROM myTable; ---- ==== Token @@ -114,7 +112,7 @@ The type of the arguments of the `token` depend on the partition key column type For example, consider the following table: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/create_table_simple.cql[] ---- @@ -134,12 +132,12 @@ uuid suitable for use in `INSERT` or `UPDATE` statements. The `now` function takes no arguments and generates, on the coordinator node, a new unique timeuuid at the time the function is invoked. Note -that this method is useful for insertion but is largely non-sensical in +that this method is useful for insertion but is largely nonsensical in `WHERE` clauses. 
For example, a query of the form: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/timeuuid_now.cql[] ---- @@ -157,7 +155,7 @@ The `max_timeuuid` works similarly, but returns the _largest_ possible `timeuuid For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/timeuuid_min_max.cql[] ---- @@ -167,7 +165,7 @@ The clause `t >= maxTimeuuid('2013-01-01 00:05+0000')` would still _not_ select [NOTE] ==== -The values generated by `min_timeuuid` and `max_timeuuid` are called _fake_ UUID because they do no respect the time-based UUID generation process +The values generated by `min_timeuuid` and `max_timeuuid` are called _fake_ UUIDs because they do not respect the time-based UUID generation process specified by the http://www.ietf.org/rfc/rfc4122.txt[IETF RFC 4122]. In particular, the value returned by these two methods will not be unique. Thus, only use these methods for *querying*, not for *insertion*, to prevent possible data overwriting. @@ -175,10 +173,7 @@ Thus, only use these methods for *querying*, not for *insertion*, to prevent pos ==== Datetime functions -===== Retrieving the current date/time - -The following functions can be used to retrieve the date/time at the -time where the function is invoked: +Retrieving the current date and time: [cols=",",options="header",] |=== @@ -195,7 +190,7 @@ time where the function is invoked: For example the last two days of data can be retrieved using: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/current_date.cql[] ---- @@ -227,7 +222,7 @@ A number of functions are provided to convert a `timeuuid`, a `timestamp` or a ` A number of functions are provided to convert the native types into binary data, or a `blob`. -For every xref:cassandra:developing/cql/types.adoc#native-types[type] supported by CQL, the function `type_as_blob` takes a argument of type `type` and returns it as a `blob`. 
+For every xref:cassandra:developing/cql/types.adoc#native-types[type] supported by CQL, the function `type_as_blob` takes an argument of type `type` and returns it as a `blob`. Conversely, the function `blob_as_type` takes a 64-bit `blob` argument and converts it to a `bigint` value. For example, `bigint_as_blob(3)` returns `0x0000000000000003` and `blob_as_bigint(0x0000000000000003)` returns `3`. @@ -339,7 +334,7 @@ include::cassandra:partial$vector-search/vector_functions.adoc[] [[human-helper-functions]] ==== Human helper functions -For user's convenience, there are currently two functions which are converting values to more human-friendly represetations. +For user's convenience, there are currently two functions which are converting values to more human-friendly representations. [cols=",,",options="header",] |=== @@ -369,7 +364,7 @@ The actual return value of the `Long.MAX_VALUE` will be 9223372036854776000 due There are three ways how to call this function. Let's have this table: -[source,cql] +[source,sql] ---- cqlsh> select * from ks.tb; @@ -387,12 +382,12 @@ cqlsh> select * from ks.tb; with schema -[source,cql] +[source,sql] ---- CREATE TABLE ks.tb ( id int PRIMARY KEY, val bigint -) +); ---- Imagine that we wanted to look at `val` values as if they were in mebibytes. We would like to have more human-friendly output in order to not visually divide the values by 1024 in order to get them in respective bigger units. The following function call may take just a column itself as an argument, and it will @@ -403,7 +398,7 @@ automatically convert it. The default source unit for `format_bytes` function is _bytes_, (`B`). ==== -[source,cql] +[source,sql] ---- cqlsh> select format_bytes(val) from ks.tb; @@ -421,7 +416,7 @@ cqlsh> select format_bytes(val) from ks.tb; The second way to call `format_bytes` functions is to specify into what size unit we would like to see all values to be converted to. 
For example, we want all size to be represented in mebibytes, hence we do: -[source,cql] +[source,sql] ---- cqlsh> select format_bytes(val, 'MiB') from ks.tb; @@ -437,9 +432,9 @@ cqlsh> select format_bytes(val, 'MiB') from ks.tb; ---- Lastly, we can specify a source unit and a target unit. A source unit tells what unit that column is logically of, the target unit tells what unit we want these values to be converted to. For example, -if we know that our column is logically in kibibytes and we want them to be converted into mebibytes, we would do: +if our column values are in kibibytes and we want to convert them to mebibytes, we would perform the following conversion: -[source,cql] +[source,sql] ---- cqlsh> select format_bytes(val, 'Kib', 'MiB') from ks.tb; @@ -470,7 +465,7 @@ Return values can be max of `Double.MAX_VALUE`, If the conversion produces overf The default source unit for `format_time` function is _milliseconds_, (`ms`). ==== -[source,cql] +[source,sql] ---- cqlsh> select format_time(val) from ks.tb; @@ -485,9 +480,9 @@ cqlsh> select format_time(val) from ks.tb; 2.06 m ---- -We may specify what unit we want that value to be converted to, give the column's values are in millseconds: +We may specify what unit we want that value to be converted to, given the column's values are in milliseconds: -[source,cql] +[source,sql] ---- cqlsh> select format_time(val, 'm') from ks.tb; @@ -504,7 +499,7 @@ cqlsh> select format_time(val, 'm') from ks.tb; Lastly, we can specify both source and target values: -[source,cql] +[source,sql] ---- cqlsh> select format_time(val, 's', 'h') from ks.tb; @@ -537,13 +532,13 @@ already in progress. For more information - CASSANDRA-17281, CASSANDRA-18252. For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/function_overload.cql[] ---- -UDFs are susceptible to all of the normal problems with the chosen programming language. 
-Accordingly, implementations should be safe against null pointer exceptions, illegal arguments, or any other potential source of exceptions. +User-defined functions (UDFs) are prone to the typical issues associated with the programming language they are written in. +Accordingly, implementations should be protected against null pointer exceptions, illegal arguments, or any other potential source of exceptions. An exception during function execution will result in the entire statement failing. Valid queries for UDF use are `SELECT`, `INSERT` and `UPDATE` statements. @@ -558,14 +553,14 @@ Note the use the double dollar-sign syntax to enclose the UDF source code. For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/function_dollarsign.cql[] ---- The implicitly available `udfContext` field (or binding for script UDFs) provides the necessary functionality to create new UDT and tuple values: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/function_udfcontext.cql[] ---- @@ -598,7 +593,7 @@ include::cassandra:example$BNF/create_function_statement.bnf[] For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/create_function.cql[] ---- @@ -635,7 +630,7 @@ include::cassandra:example$BNF/drop_function_statement.bnf[] For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/drop_function.cql[] ---- @@ -661,14 +656,14 @@ The `count` function can be used to count the rows returned by a query. 
For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/count.cql[] ---- It also can count the non-null values of a given column: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/count_nonnull.cql[] ---- @@ -679,7 +674,7 @@ The `max` and `min` functions compute the maximum and the minimum value returned For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/min_max.cql[] ---- @@ -692,7 +687,7 @@ The returned value is of the same type as the input collection elements, so ther For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/sum.cql[] ---- @@ -701,7 +696,7 @@ The returned value is of the same type as the input values, so there is a risk o values exceeds the maximum value that the type can represent. You can use type casting to cast the input values as a type large enough to contain the type. For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/sum_with_cast.cql[] ---- @@ -712,7 +707,7 @@ The `avg` function computes the average of all the values returned by a query fo For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/avg.cql[] ---- @@ -723,7 +718,7 @@ The returned value is of the same type as the input values, which might include For example `collection_avg([1, 2])` returns `1` instead of `1.5`. You can use type casting to cast to a type with the desired decimal precision. For example: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/avg_with_cast.cql[] ---- @@ -747,7 +742,7 @@ overload can appear after creation of the aggregate. A complete working example for user-defined aggregates (assuming that a keyspace has been selected using the `USE` statement): -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/uda.cql[] ---- @@ -790,7 +785,7 @@ If a `FINALFUNC` is defined, it is the return type of that function. 
[[drop-aggregate-statement]] === DROP AGGREGATE statement -Dropping an user-defined aggregate function uses the `DROP AGGREGATE` +Dropping a user-defined aggregate function uses the `DROP AGGREGATE` statement: [source, bnf] ---- include::cassandra:example$BNF/drop_aggregate_statement.bnf[] ---- For instance: -[source,cql] +[source,sql] ---- include::cassandra:example$CQL/drop_aggregate.cql[] ---- @@ -811,3 +806,20 @@ different signature. The `DROP AGGREGATE` command with the optional `IF EXISTS` keywords drops an aggregate if it exists, and does nothing if a function with the signature does not exist. + +[[writetime-and-ttl-functions]] +==== `WRITETIME`, `MINWRITETIME`, `MAXWRITETIME` and `TTL` functions + +These metadata functions are only allowed in `SELECT` statements: `WRITETIME`, `MINWRITETIME`, `MAXWRITETIME` and `TTL`. +The functions take only one argument, a column name, and retrieve metadata stored internally for the column. If the column is a collection or UDT, it's possible to add element +selectors, such as `WRITETIME(phones[2..4])` or `WRITETIME(user.name)`. + +* `WRITETIME` stores the timestamp of the value of the column. +* `MINWRITETIME` stores the smallest timestamp of the value of the column. For non-collection and non-UDT columns, `MINWRITETIME` +is equivalent to `WRITETIME`. In the other cases, it returns the smallest timestamp of the values in the column. +* `MAXWRITETIME` stores the largest timestamp of the value of the column. For non-collection and non-UDT columns, `MAXWRITETIME` +is equivalent to `WRITETIME`. In the other cases, it returns the largest timestamp of the values in the column. +* `TTL` stores the remaining time to live (in seconds) for the value of the column if it is set to expire; otherwise the value is `null`. + +The `WRITETIME` and `TTL` functions can be used on multi-cell columns such as non-frozen collections or non-frozen +user-defined types. 
In that case, the functions will return the list of timestamps or TTLs for each selected cell. diff --git a/doc/modules/cassandra/pages/developing/cql/types.adoc b/doc/modules/cassandra/pages/developing/cql/types.adoc index db46d4c6ac..d792e707af 100644 --- a/doc/modules/cassandra/pages/developing/cql/types.adoc +++ b/doc/modules/cassandra/pages/developing/cql/types.adoc @@ -10,6 +10,7 @@ types]: include::cassandra:example$BNF/cql_type.bnf[] ---- +[[native-types]] == Native types The native types supported by CQL are: @@ -191,6 +192,7 @@ a date context. A `1d` duration is not equal to a `24h` one as the duration type has been created to be able to support daylight saving. +[[collections]] == Collections CQL supports three kinds of collections: `maps`, `sets` and `lists`. The diff --git a/doc/modules/cassandra/partials/masking_functions.adoc b/doc/modules/cassandra/partials/masking_functions.adoc index 528da7d825..c7a6f289bf 100644 --- a/doc/modules/cassandra/partials/masking_functions.adoc +++ b/doc/modules/cassandra/partials/masking_functions.adoc @@ -14,7 +14,7 @@ Examples: | `mask_default(value)` | Replaces its argument by an arbitrary, fixed default value of the same type. -This will be `\***\***` for text values, zero for numeric values, `false` for booleans, etc. +This will be `$$****$$` for text values, zero for numeric values, `false` for booleans, etc. Variable-length multi-valued types such as lists, sets and maps are masked as empty collections. @@ -22,7 +22,7 @@ Fixed-length multi-valued types such as tuples, user-defined types (UDTs) and ve Examples: -`mask_default('Alice')` -> `'\****'` +`mask_default('Alice')` -> `'$$****$$'` `mask_default(123)` -> `0` @@ -43,32 +43,36 @@ Examples: | `mask_inner(value, begin, end, [padding])` | Returns a copy of the first `text`, `varchar` or `ascii` argument, replacing each character except the first and last ones by a padding character. The second and third arguments are the size of the exposed prefix and suffix. 
-The optional fourth argument is the padding character, `\*` by default. +The optional fourth argument is the padding character, `*` by default. Examples: -`mask_inner('Alice', 1, 2)` -> `'A**ce'` +`mask_inner('Alice', 1, 2)` -> `'A$$**$$ce'` -`mask_inner('Alice', 1, null)` -> `'A****'` +`mask_inner('Alice', 1, null)` -> `'A$$****$$'` -`mask_inner('Alice', null, 2)` -> `'***ce'` +`mask_inner('Alice', null, 2)` -> `'$$***$$ce'` -`mask_inner('Alice', 2, 1, '\#')` -> `'Al##e'` +`mask_inner('Alice', 2, 1, '$$#$$')` -> `'Al##e'` + +`mask_inner('078-05-1120', 0, 4)` -> `'$$*******$$1120'` | `mask_outer(value, begin, end, [padding])` | Returns a copy of the first `text`, `varchar` or `ascii` argument, replacing the first and last character by a padding character. The second and third arguments are the size of the exposed prefix and suffix. -The optional fourth argument is the padding character, `\*` by default. +The optional fourth argument is the padding character, `*` by default. Examples: -`mask_outer('Alice', 1, 2)` -> `'*li**'` +`mask_outer('Alice', 1, 2)` -> `'$$*$$li$$**$$'` + +`mask_outer('Alice', 1, null)` -> `'$$*$$lice'` -`mask_outer('Alice', 1, null)` -> `'*lice'` +`mask_outer('Alice', null, 2)` -> `'Ali$$**$$'` -`mask_outer('Alice', null, 2)` -> `'Ali**'` +`mask_outer('Alice', 2, 1, '$$#$$')` -> `'$$##$$ic$$#$$'` -`mask_outer('Alice', 2, 1, '\#')` -> `'##ic#'` +`mask_outer('11:39:33', 2, 2, '0')` -> `'00:39:00'` | `mask_hash(value, [algorithm])` | Returns a `blob` containing the hash of the first argument.
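Editorial note for reviewers: the two pieces this patch documents — the new `insert_parameter` grammar for `INSERT ... USING`, and the `WRITETIME`/`TTL` metadata functions — compose naturally. A minimal sketch follows; the `ks.users` table and the literal values are hypothetical illustrations, not taken from the patch:

```cql
-- Hypothetical table, for illustration only
CREATE TABLE IF NOT EXISTS ks.users (
    id int PRIMARY KEY,
    name text
);

-- insert_parameter ::= ( TIMESTAMP | TTL ) ( integer | bind_marker )
INSERT INTO ks.users (id, name)
VALUES (1, 'Alice')
USING TTL 86400 AND TIMESTAMP 1735689600000000;

-- The metadata functions read back what was written:
-- WRITETIME returns microseconds, TTL the remaining seconds.
SELECT WRITETIME(name), TTL(name) FROM ks.users WHERE id = 1;
```

In a prepared statement, the same `USING` clause can take bind markers instead of literals, e.g. `USING TTL ? AND TIMESTAMP ?`, matching the `bind_marker` alternative in the new BNF.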