[
https://issues.apache.org/jira/browse/SPARK-55749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun updated SPARK-55749:
----------------------------------
Affects Version/s: 4.2.0
(was: 3.4.3)
> array_contains built in function result are inconsistent with other engines
> ---------------------------------------------------------------------------
>
> Key: SPARK-55749
> URL: https://issues.apache.org/jira/browse/SPARK-55749
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: Oleks V.
> Priority: Major
>
> Hi community
>
> I would love to get help on `array_contains` builtin function results when
> NULL is one of the values in the array. The function returns NULL which is
> opposite to other engines like DuckDB/Postresql.
>
> Example:
> {code:java}
> scala> spark.sql("select array_contains(array(1, null, 3), 2)").show(false)
> +------------------------------------+
> |array_contains(array(1, NULL, 3), 2)|
> +------------------------------------+
> |null |
> +------------------------------------+
> {code}
> For instance DuckDB returns false
>
>
> {code:java}
> D select array_has([1, null, 3], 2);
> ┌───────────────────────────────────────────┐
> │ array_has(main.list_value(1, NULL, 3), 2) │
> │ boolean │
> ├───────────────────────────────────────────┤
> │ false │
> └───────────────────────────────────────────┘
> D {code}
>
> Postgres doesn't have *array_contains* in Spark/DuckDB understanding but the
> closest analogue would be
>
> {code:java}
> select array[1, null, 3] @> array[2]
> false {code}
> For this ticket I'd like to understand if this behavior for Apache Spark is
> correct, because the documentation for the function does not mention such
> scenario explicitly
>
> {code:java}
> /**
> * Checks if the array (left) has the element (right)
> */
> @ExpressionDescription(
> usage = "_FUNC_(array, value) - Returns true if the array contains the
> value.",
> examples = """
> Examples:
> > SELECT _FUNC_(array(1, 2, 3), 2);
> true
> """,
> group = "array_funcs",
> since = "1.5.0")
> case class ArrayContains(left: Expression, right: Expression) {code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]