Hi,

Why does the TermAttributeImpl.clone() method use buff.clone() instead of
System.arraycopy to clone its internal buffer? Is it for performance reasons?

I have the following scenario:

...
public boolean incrementToken() {
...
String twoHundredKCharsString = "abc....";
String smallString = "test";

termAttribute.setTermBuffer(twoHundredKCharsString);
State largeStringState = captureState();

termAttribute.setTermBuffer(smallString);
State smallStringState = captureState();

...
}
...

And guess what?! smallStringState has a TermAttribute object that
holds an internal buffer of 200k chars in size, even though the term
itself is only 4 chars long!!!
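
To make the effect concrete without pulling in Lucene, here is a minimal
standalone sketch. SimpleTerm and its methods are hypothetical stand-ins
that I am assuming behave like a grow-only term buffer; this is not the
actual TermAttributeImpl code. It compares cloning the whole backing
array against copying only the chars in use:

```java
import java.util.Arrays;

// Hypothetical stand-in for a term attribute; NOT Lucene's TermAttributeImpl.
public class BufferCloneDemo {

    static class SimpleTerm {
        char[] buffer = new char[16];
        int length = 0;

        // Grow-only buffer: capacity never shrinks when a shorter term is set.
        void setTermBuffer(String s) {
            if (buffer.length < s.length()) {
                buffer = new char[s.length()];
            }
            s.getChars(0, s.length(), buffer, 0);
            length = s.length();
        }

        // Copies the whole backing array, including unused capacity.
        SimpleTerm cloneWholeBuffer() {
            SimpleTerm copy = new SimpleTerm();
            copy.buffer = buffer.clone();
            copy.length = length;
            return copy;
        }

        // Copies only the characters that are actually in use.
        SimpleTerm cloneUsedChars() {
            SimpleTerm copy = new SimpleTerm();
            copy.buffer = new char[length];
            System.arraycopy(buffer, 0, copy.buffer, 0, length);
            copy.length = length;
            return copy;
        }
    }

    // Sets a large term, then "test", and returns the buffer capacity of
    // each kind of copy: {whole-buffer clone, used-chars copy}.
    static int[] capacitiesAfterShortTerm(int largeSize) {
        SimpleTerm term = new SimpleTerm();
        char[] big = new char[largeSize];
        Arrays.fill(big, 'a');
        term.setTermBuffer(new String(big));
        term.setTermBuffer("test");
        return new int[] {
            term.cloneWholeBuffer().buffer.length,
            term.cloneUsedChars().buffer.length
        };
    }

    public static void main(String[] args) {
        int[] caps = capacitiesAfterShortTerm(200_000);
        System.out.println("buffer.clone():   " + caps[0] + " chars");
        System.out.println("System.arraycopy: " + caps[1] + " chars");
    }
}
```

In this sketch the whole-buffer clone keeps all 200000 chars of capacity
for the 4-char term, while the arraycopy-based copy holds only 4.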

I was googling and found out that clone() and System.arraycopy have
the same performance for small arrays, and that clone() only performs
better for large arrays.

So, if large string inputs are not a real scenario, why not use
System.arraycopy instead of clone()? And if they are a real scenario,
Lucene should definitely not be copying the entire buffer for small strings.

Maybe the TermAttribute interface could expose a method like
shrinkBuffer(), so the user could invoke it when needed.
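
A rough sketch of what I have in mind, again on a hypothetical standalone
class (ShrinkableTerm is an assumption for illustration, and shrinkBuffer()
does not exist in Lucene as far as this proposal goes): the method simply
trims the backing array down to the chars in use.

```java
import java.util.Arrays;

// Hypothetical illustration of the proposed shrinkBuffer(); not Lucene code.
public class ShrinkBufferDemo {

    static class ShrinkableTerm {
        private char[] buffer = new char[16];
        private int length = 0;

        // Grow-only setter, assumed to mirror a reusable term buffer.
        void setTermBuffer(String s) {
            if (buffer.length < s.length()) {
                buffer = new char[s.length()];
            }
            s.getChars(0, s.length(), buffer, 0);
            length = s.length();
        }

        // Proposed method: trim the backing array to the used length.
        void shrinkBuffer() {
            if (buffer.length > length) {
                buffer = Arrays.copyOf(buffer, length);
            }
        }

        int capacity() { return buffer.length; }

        String term() { return new String(buffer, 0, length); }
    }

    // Sets a large term, then a small one, shrinks, and returns the term.
    static ShrinkableTerm shrunkTerm(int largeSize, String small) {
        ShrinkableTerm term = new ShrinkableTerm();
        char[] big = new char[largeSize];
        Arrays.fill(big, 'a');
        term.setTermBuffer(new String(big));
        term.setTermBuffer(small);
        term.shrinkBuffer();
        return term;
    }

    public static void main(String[] args) {
        ShrinkableTerm term = shrunkTerm(200_000, "test");
        System.out.println(term.term() + " -> capacity " + term.capacity());
    }
}
```

A caller would invoke shrinkBuffer() right before captureState(), so the
captured copy only carries the chars it needs. The trade-off is an extra
allocation, which is why I would leave it up to the user to call.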

Thoughts?

Best Regards,
Adriano Crestani
