zbtzbtzbt opened a new pull request #8214:
URL: https://github.com/apache/incubator-doris/pull/8214


   # Proposed changes
   
   Optimize raw string equality comparison for high-cardinality columns.
   
   ## Problem Summary:
   
   This PR complements the later low-cardinality column optimization. It targets raw string equality comparison on high-cardinality columns.
   
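   Doris's general-purpose string comparison algorithm is already fast. In testing, its speed was about the same as ClickHouse's string comparison.
   Both follow the same idea: use SSE4 to process the first 16*k bytes (k is an integer), then iterate over the remaining <=15 bytes.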
   
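   For high-cardinality columns, Doris compares the raw strings (char*).
   This PR gives some speedup for long-string equality comparisons where the SQL contains str1==str2 (or str1!=str2).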
   
   ## Checklist(Required)
   <img width="536" alt="111" src="https://user-images.githubusercontent.com/35688959/155321458-31ce65ad-2cc8-49ac-8940-0b26f83701cd.png">
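   Averaged over multiple test runs (one or two individual runs may fluctuate because of the dataset and similar factors), this is about 20% faster than Doris's general-purpose string comparison algorithm.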
   
   
   Test code:
   https://godbolt.org/z/689bGnMzs
   
   @wangbo @HappenLee @zenoyang 
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



