[ 
https://issues.apache.org/jira/browse/KUDU-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507719#comment-16507719
 ] 

Tomas Farkas commented on KUDU-2463:
------------------------------------

After resuming the workload and re-running some partitions, the problematic 
partitions disappeared; now, regardless of the where condition, the count per 
range partition key (created_date) is correct. Yes, the table is 
updated/deleted/inserted. The Kudu cluster also had an issue with soft limits, 
so I had to restart it, but after the restart some of the Kudu tablet servers 
did not start due to the known bug, so I had to upgrade to CDH 5.13.3, 
where it was fixed. After that Kudu was healthy, but I realized that this table 
was returning different results in Impala. 
So maybe the version change caused the issue.
ksck returns OK.

> Different results returned by group by on count() metric
> --------------------------------------------------------
>
>                 Key: KUDU-2463
>                 URL: https://issues.apache.org/jira/browse/KUDU-2463
>             Project: Kudu
>          Issue Type: Bug
>          Components: impala
>    Affects Versions: 1.5.0
>            Reporter: Tomas Farkas
>            Priority: Critical
>
> Hi, 
> I have a static table in Kudu; no inserts, updates or deletes are running on 
> the cluster. The query returns a DIFFERENT result when I change the where 
> condition on one of the primary key columns, which is in the group by list.
> The created_date column is part of the PK and is of type int.
> The PK contains subscriber, time, date, identifier and created_date.
> I tried to check whether the inserted count matches the HDFS table, and 
> noticed for one day that the count differs based on the where criteria!!
> {quote} 
> {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date >= 20180601 group by created_date;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date >= 20180601 group by created_date}}
>  {{Query submitted at: 2018-06-04 21:06:30 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=ce4e92eda5aaa02f:ea07aa4600000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180603 | 12145131 |}}
>  {{| 20180601 | 18076448 | -> 195k MORE!!!}}
>  {{| 20180602 | 13325080 |}}
>  {{| 20180604 | 3788161 |}}
>  {{+---------------+---------+}}
>  {{Fetched 4 row(s) in 0.37s}}
>  {{[10.197.0.164:21000] >}}
>  {{[10.197.0.164:21000] >}}
>  {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date >= 20180601 group by created_date order by 1;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date >= 20180601 group by created_date order by 1}}
>  {{Query submitted at: 2018-06-04 21:06:55 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=d541a9dda19e28e4:be4a2ca000000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180601 | 18076448 | -> 195k MORE!!!}}
>  {{| 20180602 | 13325080 |}}
>  {{| 20180603 | 12145131 |}}
>  {{| 20180604 | 3788161 |}}
>  {{+---------------+---------+}}
>  {{Fetched 4 row(s) in 1.14s}}
>  {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date >= 20180528 group by created_date order by 1;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date >= 20180528 group by created_date order by 1}}
>  {{Query submitted at: 2018-06-04 21:07:12 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=774a261fb94ad2bb:aab28b8b00000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180528 | 17607943 |}}
>  {{| 20180529 | 20741097 |}}
>  {{| 20180530 | 17362364 |}}
>  {{| 20180531 | 16877228 |}}
>  {{| 20180601 | 17925671 | -> 44k MORE!! }}
>  {{| 20180602 | 13325080 |}}
>  {{| 20180603 | 12145131 |}}
>  {{| 20180604 | 3788161 |}}
>  {{+---------------+---------+}}
>  {{Fetched 8 row(s) in 0.67s}}
>  {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date >= 20180525 group by created_date order by 1;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date >= 20180525 group by created_date order by 1}}
>  {{Query submitted at: 2018-06-04 21:07:25 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=38483ad3ae5c8eb9:a538cb6300000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180525 | 22309857 |}}
>  {{| 20180526 | 15268520 |}}
>  {{| 20180527 | 14939691 |}}
>  {{| 20180528 | 17607943 |}}
>  {{| 20180529 | 20741097 |}}
>  {{| 20180530 | 17362364 |}}
>  {{| 20180531 | 16903829 |}}
>  {{| 20180601 | 18047010 | -> 165k MORE!!!}}
>  {{| 20180602 | 13325080 |}}
>  {{| 20180603 | 12145131 |}}
>  {{| 20180604 | 3788161 |}}
>  {{+---------------+---------+}}
>  {{Fetched 11 row(s) in 0.85s}}
>  {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date = 20180601 group by created_date;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date = 20180601 group by created_date}}
>  {{Query submitted at: 2018-06-04 21:07:42 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=7343ba31f6b4c86f:621a7b8c00000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180601 | 17881253 | -> CORRECT ONE}}
>  {{+---------------+---------+}}
>  {{Fetched 1 row(s) in 0.27s}}
>  {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date >= 20180525 group by created_date order by 1;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date >= 20180525 group by created_date order by 1}}
>  {{Query submitted at: 2018-06-04 21:12:02 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=4141df26117f35c3:9ab2f0700000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180525 | 22309857 |}}
>  {{| 20180526 | 15268520 |}}
>  {{| 20180527 | 14939691 |}}
>  {{| 20180528 | 17607943 |}}
>  {{| 20180529 | 20741097 |}}
>  {{| 20180530 | 17362364 |}}
>  {{| 20180531 | 16903829 |}}
>  {{| 20180601 | 18047010 | -> AGAIN WRONG RESULT!!}}
>  {{| 20180602 | 13325080 |}}
>  {{| 20180603 | 12145131 |}}
>  {{| 20180604 | 3788161 |}}
>  {{+---------------+---------+}}
>  {{Fetched 11 row(s) in 1.04s}}
> Again, no other inserts/selects/updates or deletes were running between these 
> statements on the cluster. 
>  
> I checked the explain output to see if there is a difference, but it looks 
> OK. Yet the result is different!
>  
> {{[10.197.0.164:21000] > explain select created_date, count(*) from 
> base.usage_kudu where created_date = 20180601 group by created_date;}}
>  {{Query: explain select created_date, count(*) from base.usage_kudu where 
> created_date = 20180601 group by created_date}}
>  {{+--------------------------------------------------+}}
>  {{| Explain String |}}
>  {{+--------------------------------------------------+}}
>  {{| Max Per-Host Resource Reservation: Memory=3.94MB |}}
>  {{| Per-Host Resource Estimates: Memory=20.00MB |}}
>  {{| |}}
>  {{| PLAN-ROOT SINK |}}
>  {{| | |}}
>  {{| 04:EXCHANGE [UNPARTITIONED] |}}
>  {{| | |}}
>  {{| 03:AGGREGATE [FINALIZE] |}}
>  {{| | output: count:merge(*) |}}
>  {{| | group by: created_date |}}
>  {{| | |}}
>  {{| 02:EXCHANGE [HASH(created_date)] |}}
>  {{| | |}}
>  {{| 01:AGGREGATE [STREAMING] |}}
>  {{| | output: count(*) |}}
>  {{| | group by: created_date |}}
>  {{| | |}}
>  {{| 00:SCAN KUDU [base.usage_kudu] |}}
>  {{| kudu predicates: created_date = 20180601 |}}
>  {{+--------------------------------------------------+}}
>  {{Fetched 19 row(s) in 0.06s}}
>  {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date = 20180601 group by created_date;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date = 20180601 group by created_date}}
>  {{Query submitted at: 2018-06-04 21:17:21 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=c449aabea51e7456:612f096400000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180601 | 17881253 |}}
>  {{+---------------+---------+}}
>  {{Fetched 1 row(s) in 0.38s}}
>  {{[10.197.0.164:21000] > explain select created_date, count(*) from 
> base.usage_kudu where created_date >= 20180525 group by created_date order by 
> 1;}}
>  {{Query: explain select created_date, count(*) from base.usage_kudu where 
> created_date >= 20180525 group by created_date order by 1}}
>  {{+--------------------------------------------------+}}
>  {{| Explain String |}}
>  {{+--------------------------------------------------+}}
>  {{| Max Per-Host Resource Reservation: Memory=9.94MB |}}
>  {{| Per-Host Resource Estimates: Memory=26.00MB |}}
>  {{| |}}
>  {{| PLAN-ROOT SINK |}}
>  {{| | |}}
>  {{| 05:MERGING-EXCHANGE [UNPARTITIONED] |}}
>  {{| | order by: created_date ASC |}}
>  {{| | |}}
>  {{| 02:SORT |}}
>  {{| | order by: created_date ASC |}}
>  {{| | |}}
>  {{| 04:AGGREGATE [FINALIZE] |}}
>  {{| | output: count:merge(*) |}}
>  {{| | group by: created_date |}}
>  {{| | |}}
>  {{| 03:EXCHANGE [HASH(created_date)] |}}
>  {{| | |}}
>  {{| 01:AGGREGATE [STREAMING] |}}
>  {{| | output: count(*) |}}
>  {{| | group by: created_date |}}
>  {{| | |}}
>  {{| 00:SCAN KUDU [base.usage_kudu] |}}
>  {{| kudu predicates: created_date >= 20180525 |}}
>  {{+--------------------------------------------------+}}
>  {{Fetched 23 row(s) in 0.05s}}
>  {{[10.197.0.164:21000] > select created_date, count(*) from base.usage_kudu 
> where created_date >= 20180525 group by created_date order by 1;}}
>  {{Query: select created_date, count(*) from base.usage_kudu where 
> created_date >= 20180525 group by created_date order by 1}}
>  {{Query submitted at: 2018-06-04 21:17:32 (Coordinator: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000|http://ip-10-197-0-164.eu-west-1.compute.internal:25000/])}}
>  {{Query progress can be monitored at: 
> [http://ip-10-197-0-164.eu-west-1.compute.internal:25000/query_plan?query_id=bc4a36f2a7ad3280:c7b09a5100000000]}}
>  {{+---------------+---------+}}
>  {{| created_date | count(*) |}}
>  {{+---------------+---------+}}
>  {{| 20180525 | 22309857 |}}
>  {{| 20180526 | 15268520 |}}
>  {{| 20180527 | 14939691 |}}
>  {{| 20180528 | 17607943 |}}
>  {{| 20180529 | 20741097 |}}
>  {{| 20180530 | 17362364 |}}
>  {{| 20180531 | 16903829 |}}
>  {{| 20180601 | 18047010 |}}
>  {{| 20180602 | 13325080 |}}
>  {{| 20180603 | 12145131 |}}
>  {{| 20180604 | 3788161 |}}
>  {{+---------------+---------+}}
>  {{Fetched 11 row(s) in 0.88s}}
>  
> {quote}
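
The figures in the quoted transcript can be cross-checked mechanically. The
following is a minimal Python sketch and not part of the original report: the
helper names `excess_over_baseline` and `find_disagreements` are hypothetical,
and the counts are copied by hand from the impala-shell output above. It takes
the equality-predicate count (the one the reporter marked CORRECT ONE) as the
baseline, which is the reporter's assumption and is not verified here.

```python
# Counts for created_date = 20180601, keyed by the WHERE predicate used.
# Values are copied from the quoted impala-shell transcript.
COUNTS_20180601 = {
    "created_date = 20180601": 17881253,   # marked CORRECT ONE by the reporter
    "created_date >= 20180601": 18076448,
    "created_date >= 20180528": 17925671,
    "created_date >= 20180525": 18047010,
}

# Full per-date results of two of the range queries from the transcript.
GE_20180528 = {
    20180528: 17607943, 20180529: 20741097, 20180530: 17362364,
    20180531: 16877228, 20180601: 17925671, 20180602: 13325080,
    20180603: 12145131, 20180604: 3788161,
}
GE_20180525 = {
    20180525: 22309857, 20180526: 15268520, 20180527: 14939691,
    20180528: 17607943, 20180529: 20741097, 20180530: 17362364,
    20180531: 16903829, 20180601: 18047010, 20180602: 13325080,
    20180603: 12145131, 20180604: 3788161,
}


def excess_over_baseline(counts, baseline_predicate):
    """Return how many rows each predicate reports beyond the baseline."""
    baseline = counts[baseline_predicate]
    return {
        predicate: n - baseline
        for predicate, n in counts.items()
        if predicate != baseline_predicate
    }


def find_disagreements(results_a, results_b):
    """Return dates present in both results whose counts differ."""
    shared = sorted(results_a.keys() & results_b.keys())
    return {
        day: (results_a[day], results_b[day])
        for day in shared
        if results_a[day] != results_b[day]
    }


if __name__ == "__main__":
    for predicate, extra in excess_over_baseline(
        COUNTS_20180601, "created_date = 20180601"
    ).items():
        print(f"{predicate}: {extra:+d} rows vs. equality predicate")
    for day, (a, b) in find_disagreements(GE_20180528, GE_20180525).items():
        print(f"{day}: {a} vs. {b}")
```

Applied to the quoted figures, the comparison of the two range queries flags
20180531 (16877228 vs. 16903829) in addition to 20180601, a difference the
report does not call out.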



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
