[
https://issues.apache.org/jira/browse/CALCITE-2907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lai Zhou updated CALCITE-2907:
------------------------------
Description:
In my usecase:
an Aggegate which contains distinct call was converted improperly to an error
relational algebra.
{code:java}
SELECT user_id,
order_id,
product_id,
count(DISTINCT phone) AS contact_count,
count(DISTINCT (CASE
WHEN is_cell_phone=0 THEN phone
END)) AS fixedphone_count,
count(DISTINCT (CASE
WHEN is_cell_phone=1 THEN phone
END)) AS telehone_count,
count(DISTINCT substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),1,3)) AS seg1uv,
count(DISTINCT substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),4,4)) AS seg2uv,
count(DISTINCT substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),8,4)) AS seg3uv,
stddev_pop(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),1,3)) AS seg1stddev,
stddev_pop(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),4,4)) AS seg2stddev,
stddev_pop(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),8,4)) AS seg3stddev,
entropy(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),1,3)) AS seg1entropy,
entropy(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),4,4)) AS seg2entropy,
entropy(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),8,4)) AS seg3entropy
FROM dw_risk__mygravitation_v_snap_contacts_contacts
GROUP BY user_id,
order_id,
product_id
{code}
After digging into the code,I found at the line 444 of the
AggregateExpandDistinctAggregatesRule.java :
{code:java}
int x = groupCount;
final List<AggregateCall> newCalls = new ArrayList<>();
for (AggregateCall aggCall : aggregate.getAggCallList()) {
final int newFilterArg;
final List<Integer> newArgList;
final SqlAggFunction aggregation;
if (!aggCall.isDistinct()) {
aggregation = SqlStdOperatorTable.MIN;
newArgList = ImmutableIntList.of(x++);
newFilterArg = filters.get(aggregate.getGroupSet());
} else {
{code}
the undistinct aggregate call `stddev_pop` and `entropy` was converted to a
SqlStdOperatorTable.MIN,(actually all undistinct aggregate call here will be
converted to a SqlStdOperatorTable.MIN ).
I don't understand how it works.
I guess someone make a faulty assumption here. [~julianhyde] ,can someone help
me ASAP ?
It’s very important for my business.
was:
In my usecase:
an Aggegate which contains distinct call was converted improperly to an error
relational algebra.
{code:java}
SELECT user_id,
order_id,
product_id,
count(DISTINCT phone) AS contact_count,
count(DISTINCT (CASE
WHEN is_cell_phone=0 THEN phone
END)) AS fixedphone_count,
count(DISTINCT (CASE
WHEN is_cell_phone=1 THEN phone
END)) AS telehone_count,
count(DISTINCT substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),1,3)) AS seg1uv,
count(DISTINCT substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),4,4)) AS seg2uv,
count(DISTINCT substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),8,4)) AS seg3uv,
stddev_pop(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),1,3)) AS seg1stddev,
stddev_pop(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),4,4)) AS seg2stddev,
stddev_pop(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),8,4)) AS seg3stddev,
entropy(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),1,3)) AS seg1entropy,
entropy(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),4,4)) AS seg2entropy,
entropy(substr((CASE
WHEN is_cell_phone=1 THEN
secured_libs.u51decrypt(phone)
END),8,4)) AS seg3entropy
FROM dw_risk__mygravitation_v_snap_contacts_contacts
GROUP BY user_id,
order_id,
product_id
{code}
After digging into the code,I found at the line 444 of the
AggregateExpandDistinctAggregatesRule.java :
{code:java}
int x = groupCount;
final List<AggregateCall> newCalls = new ArrayList<>();
for (AggregateCall aggCall : aggregate.getAggCallList()) {
final int newFilterArg;
final List<Integer> newArgList;
final SqlAggFunction aggregation;
if (!aggCall.isDistinct()) {
aggregation = SqlStdOperatorTable.MIN;
newArgList = ImmutableIntList.of(x++);
newFilterArg = filters.get(aggregate.getGroupSet());
} else {
{code}
the undistinct aggregate call `stddev_pop` was converted to a
SqlStdOperatorTable.MIN, I don't understand how it works.
I guess someone make a faulty assumption here. [~julianhyde] ,can someone help
me?
It’s very important for my business.
> AggregateExpandDistinctAggregatesRule produces a wrong relational algebra
> -------------------------------------------------------------------------
>
> Key: CALCITE-2907
> URL: https://issues.apache.org/jira/browse/CALCITE-2907
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.18.0
> Reporter: Lai Zhou
> Priority: Major
>
> In my usecase:
> an Aggegate which contains distinct call was converted improperly to an error
> relational algebra.
>
> {code:java}
> SELECT user_id,
> order_id,
> product_id,
> count(DISTINCT phone) AS contact_count,
> count(DISTINCT (CASE
> WHEN is_cell_phone=0 THEN phone
> END)) AS fixedphone_count,
> count(DISTINCT (CASE
> WHEN is_cell_phone=1 THEN phone
> END)) AS telehone_count,
> count(DISTINCT substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),1,3)) AS seg1uv,
> count(DISTINCT substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),4,4)) AS seg2uv,
> count(DISTINCT substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),8,4)) AS seg3uv,
> stddev_pop(substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),1,3)) AS seg1stddev,
> stddev_pop(substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),4,4)) AS seg2stddev,
> stddev_pop(substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),8,4)) AS seg3stddev,
> entropy(substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),1,3)) AS seg1entropy,
> entropy(substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),4,4)) AS seg2entropy,
> entropy(substr((CASE
> WHEN is_cell_phone=1 THEN
> secured_libs.u51decrypt(phone)
> END),8,4)) AS seg3entropy
> FROM dw_risk__mygravitation_v_snap_contacts_contacts
> GROUP BY user_id,
> order_id,
> product_id
> {code}
>
>
> After digging into the code,I found at the line 444 of the
> AggregateExpandDistinctAggregatesRule.java :
>
> {code:java}
> int x = groupCount;
> final List<AggregateCall> newCalls = new ArrayList<>();
> for (AggregateCall aggCall : aggregate.getAggCallList()) {
> final int newFilterArg;
> final List<Integer> newArgList;
> final SqlAggFunction aggregation;
> if (!aggCall.isDistinct()) {
> aggregation = SqlStdOperatorTable.MIN;
> newArgList = ImmutableIntList.of(x++);
> newFilterArg = filters.get(aggregate.getGroupSet());
> } else {
> {code}
> the undistinct aggregate call `stddev_pop` and `entropy` was converted to a
> SqlStdOperatorTable.MIN,(actually all undistinct aggregate call here will be
> converted to a SqlStdOperatorTable.MIN ).
> I don't understand how it works.
> I guess someone make a faulty assumption here. [~julianhyde] ,can someone
> help me ASAP ?
> It’s very important for my business.
>
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)