Gopal V created HIVE-14573:
------------------------------

             Summary: Vectorization: Implement StringExpr::find() 
                 Key: HIVE-14573
                 URL: https://issues.apache.org/jira/browse/HIVE-14573
             Project: Hive
          Issue Type: Bug
            Reporter: Gopal V


Currently, the LIKE expression implementation is a dumb StringExpr::equals() 
loop.

For an input of N bytes and a pattern of M bytes, this has a worst-case 
complexity of O((N-M)*M), which is not an issue with small patterns or small 
inputs.

The pattern matching is currently optimized for rows that match, while in 
clickstream data the opposite is generally true: most rows do not match.
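
As a rough illustration of the difference, here is a minimal sketch in Java. 
It is not the existing Hive code and not a proposed patch: the class 
LikeFindSketch and both method signatures are hypothetical. naiveFind() mirrors 
the compare-at-every-offset loop described above, and find() swaps in a 
Boyer-Moore-Horspool search (one possible technique, chosen here only as an 
example) that can skip up to M bytes per mismatch, which helps most when rows 
do not match.

```java
import java.util.Arrays;

public class LikeFindSketch {

    // Baseline: attempt a full pattern comparison at every offset.
    // Worst case O((N-M)*M) byte comparisons.
    static int naiveFind(byte[] input, int start, int len, byte[] pattern) {
        int m = pattern.length;
        for (int i = start; i + m <= start + len; i++) {
            int j = 0;
            while (j < m && input[i + j] == pattern[j]) {
                j++;
            }
            if (j == m) {
                return i; // absolute offset of the first match
            }
        }
        return -1;
    }

    // Boyer-Moore-Horspool: on a mismatch, shift by a distance derived from
    // the input byte aligned with the last pattern byte.
    static int find(byte[] input, int start, int len, byte[] pattern) {
        int m = pattern.length;
        if (m == 0) {
            return start; // empty pattern matches at the start
        }
        if (m > len) {
            return -1;
        }
        // Default shift is the full pattern length; bytes that occur in the
        // pattern (except its last position) get a smaller shift.
        int[] shift = new int[256];
        Arrays.fill(shift, m);
        for (int k = 0; k < m - 1; k++) {
            shift[pattern[k] & 0xff] = m - 1 - k;
        }
        int end = start + len;
        int i = start;
        while (i + m <= end) {
            int j = m - 1;
            while (j >= 0 && input[i + j] == pattern[j]) {
                j--;
            }
            if (j < 0) {
                return i; // absolute offset of the first match
            }
            i += shift[input[i + m - 1] & 0xff];
        }
        return -1;
    }
}
```

Both methods return the absolute offset of the first match in the byte range 
[start, start+len), or -1 if there is none. The shift table in the sketch is 
rebuilt on every call; since a LIKE pattern is constant for the whole query, 
it would presumably be built once and reused across rows.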

From the common crawl data, the following run will go through the same

{code}
select count(1) from uservisits_orc_data where useragent like "%Opera%" and 
searchword LIKE "%fruit%";
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
