[ https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17047281#comment-17047281 ]

Krisztian Kasa commented on HIVE-22929:
---------------------------------------

[~gopalv]
String.replace implementation is:
{code}
    public String replace(CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
                this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }
{code}
So it also calls Pattern.compile with *target* every time it is called.

The difference between replace and replaceAll is in how the pattern is compiled:
{code}
replace
Pattern.compile(target.toString(), Pattern.LITERAL)
{code}
{code}
replaceAll
Pattern.compile(regex)
{code}
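
To illustrate what the Pattern.LITERAL flag changes in practice, here is a minimal sketch. The class and method names are made up for this example; only the String and Pattern calls are standard JDK API.

```java
import java.util.regex.Pattern;

public class ReplaceVsReplaceAll {
    // Literal match: "." means the dot character itself.
    static String viaReplace(String input) {
        return input.replace(".", "-");
    }

    // Regex match: "." means "any character", so every character is replaced.
    static String viaReplaceAll(String input) {
        return input.replaceAll(".", "-");
    }

    // Quoting the target makes replaceAll treat it literally, like replace.
    static String viaQuotedReplaceAll(String input) {
        return input.replaceAll(Pattern.quote("."), "-");
    }

    public static void main(String[] args) {
        String input = "a.b.c";
        System.out.println(viaReplace(input));          // a-b-c
        System.out.println(viaReplaceAll(input));       // -----
        System.out.println(viaQuotedReplaceAll(input)); // a-b-c
    }
}
```

For a target without regex metacharacters, such as the two backticks in the lexer rule, both methods produce the same result, so the choice between them only affects cost.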

I did some testing:
{code}
  // Compiled once and reused by the third loop.
  public static final Pattern REGEX = Pattern.compile("am", Pattern.LITERAL);

  @Test
  public void testReplacePerf() {
    long count = 10000000;

    // Case 1: String.replaceAll compiles a regex pattern on every call.
    long start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replaceAll("am", "b");
    }
    System.out.println("String.replaceAll: " + (System.currentTimeMillis() - start));

    // Case 2: String.replace compiles a literal pattern on every call.
    start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replace("am", "b");
    }
    System.out.println("String.replace: " + (System.currentTimeMillis() - start));

    // Case 3: the precompiled pattern is reused, so nothing is compiled in the loop.
    start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = RegExUtils.replaceAll("sample sample", REGEX, "b");
    }
    System.out.println("Precompiled regex + RegExUtils.replaceAll:" + (System.currentTimeMillis() - start));
  }
{code}
{code}
String.replaceAll: 3997
String.replace: 3028
Precompiled regex + RegExUtils.replaceAll:2164
{code}
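
Applied to the quoted identifier case, the precompiled approach could look roughly like the sketch below. This is not the attached patch: the class, constant, and method names are hypothetical, and it uses plain java.util.regex instead of RegExUtils so that it has no dependencies.

```java
import java.util.regex.Pattern;

public class QuotedIdentifierSketch {
    // Hypothetical constant: compiled once, matches two backticks literally.
    private static final Pattern ESCAPED_BACKTICK =
            Pattern.compile("``", Pattern.LITERAL);

    // Hypothetical helper: strips the enclosing backticks, then
    // unescapes each doubled backtick to a single one.
    static String unquote(String quoted) {
        String body = quoted.substring(1, quoted.length() - 1);
        return ESCAPED_BACKTICK.matcher(body).replaceAll("`");
    }

    public static void main(String[] args) {
        System.out.println(unquote("`my``col`")); // my`col
    }
}
```

The pattern is built once when the class is loaded, so each call only pays for creating a Matcher and doing the replacement.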

Please share your thoughts.

> Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-22929
>                 URL: https://issues.apache.org/jira/browse/HIVE-22929
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Krisztian Kasa
>            Priority: Major
>         Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
>     '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, getText().length() -1 ).replaceAll("``", "`")); }
> {code}


