chenlinzhong opened a new pull request, #11761:
URL: https://github.com/apache/doris/pull/11761

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   
   ```
   //create one table such as table2
   CREATE TABLE `table2` (
     `event_day` date NULL,
     `siteid` int(11) NULL DEFAULT "10",
     `citycode` smallint(6) NULL,
     `visitinfo` varchar(1024) NULL DEFAULT "",
     `pv` varchar(1024) REPLACE NULL DEFAULT "0"
   ) ENGINE=OLAP
   AGGREGATE KEY(`event_day`, `siteid`, `citycode`, `visitinfo`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`siteid`) BUCKETS 10
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2"
   )
   
   //import some data 
   MySQL [test_db]> select * from table2;
   
+------------+--------+----------+------------------------------------+------+
   | event_day  | siteid | citycode | visitinfo                           | pv  
 |
   
+------------+--------+----------+------------------------------------+------+
   | 2017-07-03 |      8 |       12 | {"ip":"192.168.0.5","source":"pc"} | 81   
|
   | 2017-07-03 |     37 |       12 | {"ip":"192.168.0.3","source":"pc"} | 81   
|
   | 2017-07-03 |     67 |       16 | {"ip":"192.168.0.2","source":"pc"} | 79   
|
   | 2017-07-03 |    101 |       11 | {"ip":"192.168.0.5","source":"pc"} | 65   
|
   | 2017-07-03 |     32 |       15 | {"ip":"192.168.0.1","source":"pc"} | 188  
|
   | 2017-07-03 |    103 |       12 | {"ip":"192.168.0.5","source":"pc"} | 123  
|
   | 2017-07-03 |    104 |       16 | {"ip":"192.168.0.5","source":"pc"} | 79   
|
   | 2017-07-03 |      3 |       12 | {"ip":"192.168.0.3","source":"pc"} | 123  
|
   | 2017-07-03 |      3 |       15 | {"ip":"192.168.0.2","source":"pc"} | 188  
|
   | 2017-07-03 |     13 |       11 | {"ip":"192.168.0.1","source":"pc"} | 65   
|
   | 2017-07-03 |     53 |       12 | {"ip":"192.168.0.2","source":"pc"} | 123  
|
   | 2017-07-03 |      1 |       11 | {"ip":"192.168.0.1","source":"pc"} | 65   
|
   | 2017-07-03 |      7 |       16 | {"ip":"192.168.0.4","source":"pc"} | 79   
|
   | 2017-07-03 |    102 |       15 | {"ip":"192.168.0.5","source":"pc"} | 188  
|
   | 2017-07-03 |    105 |       12 | {"ip":"192.168.0.5","source":"pc"} | 81   
|
   
+------------+--------+----------+------------------------------------+------+
   
   ```
   
   ### 1. find most visit top 3 ip 
    not implement this demo in c++, if you are interested in this demo you can 
reference python/java implement
   ```
   MySQL [test_db]> CREATE AGGREGATE FUNCTION  
rpc_count_visit_info(varchar(1024)) RETURNS varchar(1024) PROPERTIES (
       "TYPE"="RPC",
       "OBJECT_FILE"="127.0.0.1:9000",
       "update_fn"="rpc_count_visit_info_update",
       "merge_fn"="rpc_count_visit_info_merge",
       "finalize_fn"="rpc_count_visit_info_finalize"
   );
   
   MySQL [test_db]> select rpc_count_visit_info(visitinfo) from table2;
   +--------------------------------------------+
   | rpc_count_visit_info(`visitinfo`)           |
   +--------------------------------------------+
   | 192.168.0.5:6 192.168.0.2:3 192.168.0.1:3  |
   +--------------------------------------------+
   1 row in set (0.036 sec)
   
   MySQL [test_db]> select citycode, rpc_count_visit_info(visitinfo) from 
table2 group by citycode;
   +----------+--------------------------------------------+
   | citycode | rpc_count_visit_info(`visitinfo`)           |
   +----------+--------------------------------------------+
   |       15 | 192.168.0.2:1 192.168.0.1:1 192.168.0.5:1  |
   |       11 | 192.168.0.1:2 192.168.0.5:1                |
   |       12 | 192.168.0.5:3 192.168.0.3:2 192.168.0.2:1  |
   |       16 | 192.168.0.2:1 192.168.0.4:1 192.168.0.5:1  |
   +----------+--------------------------------------------+
   4 rows in set (0.050 sec)
   
   ```
   ### 2. sum pv 
   ```
   CREATE AGGREGATE FUNCTION  rpc_sum(bigint) RETURNS bigint PROPERTIES (
       "TYPE"="RPC",
       "OBJECT_FILE"="127.0.0.1:9700",
       "update_fn"="rpc_sum_update",
       "merge_fn"="rpc_sum_merge",
       "finalize_fn"="rpc_sum_finalize"
   );
   
   MySQL [test_db]> select citycode, rpc_sum(pv) from table2 group by citycode;
   +----------+---------------+
   | citycode | rpc_sum(`pv`) |
   +----------+---------------+
   |       15 |           564 |
   |       11 |           195 |
   |       12 |           612 |
   |       16 |           237 |
   +----------+---------------+
   4 rows in set (0.067 sec)
   
   MySQL [test_db]> select rpc_sum(pv) from table2;
   +---------------+
   | rpc_sum(`pv`) |
   +---------------+
   |          1608 |
   +---------------+
   1 row in set (0.030 sec)
   ```
   
   ### 3. avg pv
   
   ```
   CREATE AGGREGATE FUNCTION  rpc_avg(int) RETURNS double PROPERTIES (
       "TYPE"="RPC",
       "OBJECT_FILE"="127.0.0.1:9000",
       "update_fn"="rpc_avg_update",
       "merge_fn"="rpc_avg_merge",
       "finalize_fn"="rpc_avg_finalize"
   );
   
   MySQL [test_db]> select citycode, rpc_avg(pv) from table2 group by citycode;
   +----------+---------------+
   | citycode | rpc_avg(`pv`) |
   +----------+---------------+
   |       15 |           188 |
   |       11 |            65 |
   |       12 |           102 |
   |       16 |            79 |
   +----------+---------------+
   4 rows in set (0.039 sec)
   
   MySQL [test_db]> select rpc_avg(pv) from table2;
   +---------------+
   | rpc_avg(`pv`) |
   +---------------+
   |         107.2 |
   +---------------+
   1 row in set (0.028 sec)
   
   ```
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
       - [ ] Yes
       - [ ] No
       - [ ] I don't know
   2. Has unit tests been added:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   3. Has document been added or modified:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   4. Does it need to update dependencies:
       - [ ] Yes
       - [ ] No
   5. Are there any changes that cannot be rolled back:
       - [ ] Yes (If Yes, please explain WHY)
       - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

Reply via email to