[jira] [Commented] (LUCENE-5864) Split BytesRef into BytesRef and BytesRefBuilder

Robert Muir (JIRA) Thu, 31 Jul 2014 06:42:04 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080879#comment-14080879 ]


Robert Muir commented on LUCENE-5864:
-------------------------------------

Thanks Adrien, this looks like a good step. 

A few questions on the API:
I think it's confusing that e.g. CharsRefBuilder has a public char[] and length, 
but keeps a ref around uninitialized until you call get().

Can we remove the length and char[]?

Basically I think it would be better if it only had a CharsRef, initialized of 
course to EMPTY in the ctor. grow() is the only thing changing the underlying 
char[], so it would basically just do ref.chars = grow(ref.chars...). 

We could add a method length() that then just forwards to ref.length().
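
To illustrate the proposal above, here is a minimal sketch of a builder that 
holds only an internal ref. SimpleCharsRef and SketchCharsRefBuilder are 
hypothetical stand-ins written for this example; they are not the Lucene 
classes or the code in the attached patch, and the doubling growth policy is 
an assumption.

```java
import java.util.Arrays;

// Simplified stand-in for a CharsRef-like class (hypothetical, illustration only).
final class SimpleCharsRef {
    static final char[] EMPTY_CHARS = new char[0];
    char[] chars = EMPTY_CHARS;
    int offset = 0;
    int length = 0;

    int length() {
        return length;
    }

    @Override
    public String toString() {
        return new String(chars, offset, length);
    }
}

// Builder that tracks only an internal ref: no separate public char[] or length.
final class SketchCharsRefBuilder {
    // Initialized to an empty ref up front, so it is never null.
    private final SimpleCharsRef ref = new SimpleCharsRef();

    // grow() is the only method that replaces the underlying array.
    void grow(int minSize) {
        if (ref.chars.length < minSize) {
            // Over-allocate to amortize repeated appends (assumed policy).
            int newSize = Math.max(minSize, ref.chars.length * 2);
            ref.chars = Arrays.copyOf(ref.chars, newSize);
        }
    }

    void append(char c) {
        grow(ref.length + 1);
        ref.chars[ref.length++] = c;
    }

    // length() simply forwards to the ref.
    int length() {
        return ref.length();
    }

    // Returns the internal ref itself; no copy is made.
    SimpleCharsRef get() {
        return ref;
    }
}
```

With this shape there is a single source of truth for the content, so the 
builder's length and the ref's length cannot drift apart.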

I think get() versus build() is a little confusing. With the above approach, 
the whole builder only tracks an internal ref state, so we would only need 
get(), and if someone wants a deep copy, they should do it explicitly (this is 
only rarely done). I also feel buildUTF8String() is only rarely done and not 
worth having as a method when you can do get().utf8ToString(). I wish there 
was a better name for the get() method. get() to me is what something like 
AtomicXXX uses, but toXXX() like StringBuilder would also be confusing with 
the current code (although I would greatly prefer it if this thing didn't 
allow access to its internals and always created a new 'thing', so it could 
just be consistent with that API and hence easier to use).
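
The difference between get() returning the internal ref and an explicit deep 
copy can be shown with a small sketch. DemoCharsRef and DemoCharsRefBuilder 
are hypothetical stand-ins for this example only; the static deepCopyOf 
helper is modeled on the style of CharsRef.deepCopyOf, but nothing here is 
the patch's actual code.

```java
import java.util.Arrays;

// Hypothetical stand-in for a CharsRef-like view; not Lucene's actual class.
final class DemoCharsRef {
    char[] chars = new char[0];
    int length = 0;

    // Explicit deep copy: requested by the caller when an independent snapshot is needed.
    static DemoCharsRef deepCopyOf(DemoCharsRef other) {
        DemoCharsRef copy = new DemoCharsRef();
        copy.chars = Arrays.copyOf(other.chars, other.length);
        copy.length = other.length;
        return copy;
    }

    @Override
    public String toString() {
        return new String(chars, 0, length);
    }
}

final class DemoCharsRefBuilder {
    private final DemoCharsRef ref = new DemoCharsRef();

    void append(char c) {
        if (ref.length == ref.chars.length) {
            ref.chars = Arrays.copyOf(ref.chars, Math.max(4, ref.chars.length * 2));
        }
        ref.chars[ref.length++] = c;
    }

    void clear() {
        ref.length = 0;
    }

    // The only accessor: hands out the internal ref, which later appends will mutate.
    DemoCharsRef get() {
        return ref;
    }
}

final class GetVersusCopyDemo {
    public static void main(String[] args) {
        DemoCharsRefBuilder builder = new DemoCharsRefBuilder();
        builder.append('a');
        builder.append('b');

        DemoCharsRef view = builder.get();                     // shares state with the builder
        DemoCharsRef snapshot = DemoCharsRef.deepCopyOf(view); // independent copy

        builder.clear();
        builder.append('z');

        System.out.println(view);     // prints z
        System.out.println(snapshot); // prints ab
    }
}
```

The view changes when the builder is reused, while the snapshot does not. 
That aliasing is why the name get() matters: callers need to know they are 
receiving shared state and not a fresh object.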

Why are equals() and hashCode() throwing UOE? Because they are not needed by 
anything at the moment? Maybe down the road, these builders should be 
"friendly" and more consistent with StringBuilder: meaning e.g. they support 
equals/hashCode/Comparable, and CharsRefBuilder supports Appendable and 
CharSequence and so on. We don't need to do that here though, it can be a 
followup.
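
As a rough idea of what such a "friendly" builder could look like, here is a 
sketch that implements Appendable, CharSequence and Comparable with 
content-based equals() and hashCode(). FriendlyCharsBuilder is a hypothetical 
class invented for this example and is not part of the patch or of Lucene. 
Note that content-based hashCode() on a mutable object is a design choice 
with trade-offs, since the hash changes as the content changes.

```java
import java.util.Arrays;

// Hypothetical sketch of a "friendly" chars builder; illustration only.
final class FriendlyCharsBuilder
        implements Appendable, CharSequence, Comparable<FriendlyCharsBuilder> {
    private char[] chars = new char[0];
    private int length = 0;

    private void grow(int minSize) {
        if (chars.length < minSize) {
            chars = Arrays.copyOf(chars, Math.max(minSize, chars.length * 2));
        }
    }

    @Override
    public FriendlyCharsBuilder append(char c) {
        grow(length + 1);
        chars[length++] = c;
        return this;
    }

    @Override
    public FriendlyCharsBuilder append(CharSequence csq) {
        // The Appendable contract says a null argument appends the text "null".
        CharSequence s = (csq == null) ? "null" : csq;
        return append(s, 0, s.length());
    }

    @Override
    public FriendlyCharsBuilder append(CharSequence csq, int start, int end) {
        CharSequence s = (csq == null) ? "null" : csq;
        if (start < 0 || end > s.length() || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + " end=" + end);
        }
        grow(length + (end - start));
        for (int i = start; i < end; i++) {
            chars[length++] = s.charAt(i);
        }
        return this;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return chars[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + " end=" + end);
        }
        return new String(chars, start, end - start);
    }

    @Override
    public String toString() {
        return new String(chars, 0, length);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof FriendlyCharsBuilder
                && compareTo((FriendlyCharsBuilder) o) == 0;
    }

    @Override
    public int hashCode() {
        int h = 0;
        for (int i = 0; i < length; i++) {
            h = 31 * h + chars[i];
        }
        return h;
    }

    // Compares by char value, then by length; equal content compares as 0.
    @Override
    public int compareTo(FriendlyCharsBuilder other) {
        int n = Math.min(length, other.length);
        for (int i = 0; i < n; i++) {
            int diff = chars[i] - other.chars[i];
            if (diff != 0) {
                return diff;
            }
        }
        return length - other.length;
    }
}
```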


> Split BytesRef into BytesRef and BytesRefBuilder
> ------------------------------------------------
>
>                 Key: LUCENE-5864
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5864
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>             Fix For: 4.10
>
>         Attachments: LUCENE-5864.patch
>
>
> Follow-up of LUCENE-5836.
> The fact that BytesRef (and CharsRef, IntsRef, LongsRef) can be used as 
> either pointers to a section of a byte[] or as buffers raises issues. The 
> idea would be to keep BytesRef but remove all the buffer methods like 
> copyBytes, grow, etc. and add a new class BytesRefBuilder that wraps a byte[] 
> and a length (but no offset), has grow/copyBytes/copyChars methods and the 
> ability to build BytesRef instances.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
