[
https://issues.apache.org/jira/browse/CALCITE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tanner Clary updated CALCITE-5644:
----------------------------------
Description:
BigQuery offers the
[CONTAINS_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr]
function that returns {{TRUE}} if a substring is present in an expression and
{{FALSE}} if it is not. A basic example of this may be seen in [1].
The expression can take many forms (more info in the linked doc) which makes
its implementation more complex than other string functions that only accept
arguments of type {{STRING}}. For instance, the expression to be searched can
be a column or table reference.
The function also has an optional third argument called {{json_scope}} where
the user can indicate the scope of JSON data (keys, values, or both) to be
searched for the substring.
I am curious if anyone has thoughts on how the search of rows or tables could
be implemented. I have a basic implementation (that supports expressions of
type {{STRING}} and nothing else) that I will open a draft PR for as a starting
point. To me, the challenge is implementing the additional features like the
{{JSON_SCOPE}} argument (seen in [2]) and performing a cross field search as
seen in [3].
[1] {{SELECT CONTAINS_SUBSTR("hello", "he");}} would return {{TRUE}}.
[2]
{{SELECT *
FROM Recipes
WHERE CONTAINS_SUBSTR(
(SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
'potato'
);}} would return:
/*-------------------+-------------------------+------------------*
| Breakfast | Lunch | Dinner
|
+-------------------+-------------------------+------------------+
| Potato pancakes | Toasted cheese sandwich | Beef stroganoff |
*-------------------+-------------------------+------------------*/
was:
BigQuery offers the
[CONTAINS_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr]
function that returns {{TRUE}} if a substring is present in an expression and
{{FALSE}} if it is not. A basic example of this may be seen in [1].
The expression can take many forms (more info in the linked doc) which makes
its implementation more complex than other string functions that only accept
arguments of type {{STRING}}. For instance, the expression to be searched can
be a column or table reference.
The function also has an optional third argument called {{json_scope}} where
the user can indicate the scope of JSON data (keys, values, or both) to be
searched for the substring.
I am curious if anyone has thoughts on how the search of rows or tables could
be implemented. I have a basic implementation (that supports expressions of
type {{STRING}} and nothing else) that I will open a draft PR for as a starting
point. To me, the challenge is implementing the additional features like the
{{JSON_SCOPE}} argument (seen in [2]) and performing a cross field search as
seen in [3].
[1] {{SELECT CONTAINS_SUBSTR("hello", "he");}} would return {{TRUE}}.
[2]
{{SELECT *
FROM Recipes
WHERE CONTAINS_SUBSTR(
(SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
'potato'
);}} would return:
/*-------------------+-------------------------+------------------*
| Breakfast | Lunch | Dinner
|
+-------------------+-------------------------+------------------+
| Potato pancakes | Toasted cheese sandwich | Beef stroganoff |
*-------------------+-------------------------+------------------*/
> Implement BigQuery CONTAINS_SUBSTR
> ----------------------------------
>
> Key: CALCITE-5644
> URL: https://issues.apache.org/jira/browse/CALCITE-5644
> Project: Calcite
> Issue Type: Task
> Reporter: Tanner Clary
> Priority: Major
>
> BigQuery offers the
> [CONTAINS_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#contains_substr]
> function that returns {{TRUE}} if a substring is present in an expression
> and {{FALSE}} if it is not. A basic example of this may be seen in [1].
> The expression can take many forms (more info in the linked doc) which makes
> its implementation more complex than other string functions that only accept
> arguments of type {{STRING}}. For instance, the expression to be searched can
> be a column or table reference.
> The function also has an optional third argument called {{json_scope}} where
> the user can indicate the scope of JSON data (keys, values, or both) to be
> searched for the substring.
> I am curious if anyone has thoughts on how the search of rows or tables
> could be implemented. I have a basic implementation (that supports
> expressions of type {{STRING}} and nothing else) that I will open a draft PR
> for as a starting point. To me, the challenge is implementing the additional
> features like the {{JSON_SCOPE}} argument (seen in [2]) and performing a
> cross field search as seen in [3].
> [1] {{SELECT CONTAINS_SUBSTR("hello", "he");}} would return {{TRUE}}.
> [2]
> {{SELECT *
> FROM Recipes
> WHERE CONTAINS_SUBSTR(
> (SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
> 'potato'
> );}} would return:
> /*-------------------+-------------------------+------------------*
> | Breakfast | Lunch | Dinner
> |
> +-------------------+-------------------------+------------------+
> | Potato pancakes | Toasted cheese sandwich | Beef stroganoff |
> *-------------------+-------------------------+------------------*/
--
This message was sent by Atlassian Jira
(v8.20.10#820010)