Gopal V created HIVE-14573:
------------------------------

             Summary: Vectorization: Implement StringExpr::find()
                 Key: HIVE-14573
                 URL: https://issues.apache.org/jira/browse/HIVE-14573
             Project: Hive
          Issue Type: Bug
            Reporter: Gopal V

Currently, the LIKE expression implementation is a dumb StringExpr::equals() loop.

For an input of N bytes and a pattern of M bytes, this has a complexity of O((N-M)*M), which is not an issue with small patterns or small inputs.

The pattern matching is currently optimized for rows that match, while in clickstream data the opposite is generally true: most rows do not match.

From the common crawl data, the following run will go through the same

{code}
select count(1) from uservisits_orc_data where useragent like "%Opera%" and searchword LIKE "%fruit%";
{code}
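
To illustrate what a dedicated find() could look like, here is a minimal sketch of a Boyer-Moore-Horspool style substring search over byte ranges. It is one possible approach, not necessarily what this issue ends up implementing. The class name HorspoolFinder and its find(input, start, len) signature are hypothetical and are not part of Hive's StringExpr API. The idea is that a pattern such as "Opera" is compiled once, and on a mismatch the search can skip several bytes of input instead of retrying at the next offset, which helps most when the row does not match.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not Hive's actual StringExpr API.
final class HorspoolFinder {
    private final byte[] pattern;
    // shift[b] = how far the window may move when its last byte is b.
    private final int[] shift = new int[256];

    HorspoolFinder(byte[] pattern) {
        this.pattern = pattern;
        int m = pattern.length;
        // Bytes that do not occur in the pattern allow a full-length skip.
        Arrays.fill(shift, m);
        // The last pattern byte is excluded so that every shift is at least 1.
        for (int i = 0; i < m - 1; i++) {
            shift[pattern[i] & 0xff] = m - 1 - i;
        }
    }

    // Returns the offset of the first match relative to start, or -1 if none.
    int find(byte[] input, int start, int len) {
        int m = pattern.length;
        if (m == 0) {
            return 0; // an empty pattern matches at the beginning
        }
        int end = start + len;
        int pos = start;
        while (pos + m <= end) {
            // Compare the window against the pattern from right to left.
            int j = m - 1;
            while (j >= 0 && input[pos + j] == pattern[j]) {
                j--;
            }
            if (j < 0) {
                return pos - start;
            }
            // Skip ahead based on the last byte of the current window.
            pos += shift[input[pos + m - 1] & 0xff];
        }
        return -1;
    }
}
```

With a sketch like this, a "%Opera%" predicate would build the finder once per query and call find() once per row, treating any result other than -1 as a match. The worst case is still O(N*M), but typical non-matching inputs are scanned in far fewer comparisons than the equals() loop needs.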