[
https://issues.apache.org/jira/browse/HBASE-18026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009186#comment-16009186
]
Anoop Sam John commented on HBASE-18026:
----------------------------------------
Well the concern was mainly abt the master branch patch. I have been working
in the PB area in master branch. We have gone to the new version of PB there.
Also when a request bytes been build up into a PB messages, we work with PB
CIS. Latest versions of PB allow to declare the incoming request bytes as
Immutable. If so , when we traverse over the bytes and make diff ByteString,
PB wont do any copy and exactly refer to only those bytes for a ByteString (eg:
Row PB object) Instead it will be a BoundedLiteralBS where we will have ref
to byte[] and offset and length. In such cases, the above assumptions wont be
correct. Well I realized now that the patch changed
org.apache.hadoop.hbase.protobuf.ProtobufUtil but in master branch we use,
org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil mainly (Specially for this
Get req convert etc). But the change as such is dangerous.
I did not check branch-1 based code path. When we make CIS from bytes, I dont
think we declare it as Immutable there. Means when PB create ByteString, it
will do a copy internally and as u said the byte [] will be 0 offset and
required length only. Also
{code}
public static byte[] zeroCopyGetBytes(final ByteString buf) {
if (buf instanceof LiteralByteString) {
return ((LiteralByteString) buf).bytes;
}
throw new UnsupportedOperationException("Need a LiteralByteString, got a "
+ buf.getClass().getName());
}
{code}
So iff it is LiteralByteString the same underlying byte[] been returned. Here
the offset will be 0 and length is correct. But any other type of ByteString ,
it throws Exception. May be on safe side we should just do toByteArray() for
such cases as well?
So functional wise the patch in any branch might not have been affected. In
master the PB working was is diff as I said but the change is not in a used
file. In older branches, the change might be ok as the PB usage is that way
there.
> ProtobufUtil seems to do extra array copying
> --------------------------------------------
>
> Key: HBASE-18026
> URL: https://issues.apache.org/jira/browse/HBASE-18026
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.3.2
> Reporter: Vincent Poon
> Assignee: Vincent Poon
> Priority: Minor
> Fix For: 2.0.0, 1.4.0, 1.2.6, 1.3.2, 1.1.11
>
> Attachments: HBASE-18026.branch-1.v1.patch,
> HBASE-18026.master.v1.patch
>
>
> In ProtobufUtil, the protobuf fields are copied into an array using
> toByteArray(). These are then passed into the KeyValue constructor which
> does another copy.
> It seems like we can avoid a copy here by using
> HBaseZeroCopyByteString#zeroCopyGetBytes() ?
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)