Sahil Takiar created HDFS-14285:
-----------------------------------

             Summary: libhdfs hdfsRead copies entire array even if its only partially filled
                 Key: HDFS-14285
                 URL: https://issues.apache.org/jira/browse/HDFS-14285
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs-client, libhdfs, native
            Reporter: Sahil Takiar

There is a bug in libhdfs {{hdfsRead}}:

{code:java}
jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
                    "read", "([B)I", jbRarray);
if (jthr) {
    destroyLocalReference(env, jbRarray);
    errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
                                  "hdfsRead: FSDataInputStream#read");
    return -1;
}
if (jVal.i < 0) {
    // EOF
    destroyLocalReference(env, jbRarray);
    return 0;
} else if (jVal.i == 0) {
    destroyLocalReference(env, jbRarray);
    errno = EINTR;
    return -1;
}
(*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
{code}

The method calls {{FSInputStream#read(byte[])}} to fill the Java byte array. However, {{#read(byte[])}} is not guaranteed to fill the entire array; it returns the number of bytes written to the array, which can be less than the size of the array.

Yet {{GetByteArrayRegion}} copies the entire contents of {{jbRarray}} into the buffer, because {{noReadBytes}} is initialized to the length of the buffer and is never updated. So when {{FSInputStream#read(byte[])}} reads less data than the size of the byte array, the call to {{GetByteArrayRegion}} copies more bytes than necessary.
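
The behaviour described above can be illustrated without a JVM. The following is only a minimal sketch in plain C, not the libhdfs code: {{mock_stream_read}}, {{read_partial}} and {{STAGING_CAP}} are hypothetical names invented for this example, and the staging array merely plays the role of {{jbRarray}}. It shows a read that may fill the array only partially, followed by a copy limited to the count the read returned.

```c
#include <assert.h>
#include <string.h>

#define STAGING_CAP 64  /* hypothetical capacity of the staging array */

/* Hypothetical stand-in for read(byte[]): writes at most `avail` bytes
 * into `dst` and returns the count written, or -1 at EOF. */
static int mock_stream_read(char *dst, int dstLen, const char *src, int avail)
{
    if (avail <= 0) {
        return -1;
    }
    int n = (avail < dstLen) ? avail : dstLen;
    memcpy(dst, src, (size_t)n);
    return n;
}

/* Reads up to `length` bytes into `buffer`, copying only what was read. */
static int read_partial(char *buffer, int length, const char *src, int avail)
{
    char staging[STAGING_CAP];  /* plays the role of jbRarray */
    assert(length > 0 && length <= STAGING_CAP);

    int nRead = mock_stream_read(staging, length, src, avail);
    if (nRead < 0) {
        return 0;  /* EOF, mirroring the quoted code */
    }
    /* Copy nRead bytes, not `length`. In hdfsRead this would presumably
     * correspond to passing jVal.i rather than noReadBytes to
     * GetByteArrayRegion (an assumption, not a confirmed patch). */
    memcpy(buffer, staging, (size_t)nRead);
    return nRead;
}
```

With a 3-byte source and an 8-byte destination, {{read_partial}} returns 3 and leaves the remaining 5 bytes of the destination untouched, which is the behaviour a caller of {{hdfsRead}} would likely expect.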