[
https://issues.apache.org/jira/browse/KUDU-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mike Percy reassigned KUDU-2004:
--------------------------------
Assignee: Edward Fancher
Code Review: http://gerrit.cloudera.org:8080/7141
Fix Version/s: 1.5.0
> Undefined behavior in TlsSocket::Writev()
> -----------------------------------------
>
> Key: KUDU-2004
> URL: https://issues.apache.org/jira/browse/KUDU-2004
> Project: Kudu
> Issue Type: Bug
> Components: security
> Affects Versions: 1.3.0, 1.4.0
> Reporter: Mike Percy
> Assignee: Edward Fancher
> Fix For: 1.5.0
>
>
> I got a UBSAN error on Jenkins when running
> TsRecoveryITest.TestCrashBeforeWriteLogSegmentHeader under ASAN. I had made a
> one-line change that I was testing (unrelated to this). This is the error
> message I got:
> {code}
> 2685 E0510 00:47:17.810153 656 fault_injection.cc:54] Injecting fault:
> FLAGS_fault_crash_before_write_log_segment_header (process will exit)
> 2686 W0510 00:47:17.821826 451 connection.cc:462] client connection to
> 127.121.138.0:50064 recv error: Network error: failed to read from TLS
> socket: Connection reset by peer (error 104)
> 2687 ../../src/kudu/security/tls_socket.cc:80:19: runtime error: signed
> integer overflow: 2018308256 + 2018308256 cannot be represented in type 'int'
> 2688 W0510 00:47:17.821890 460 connection.cc:462] server connection from
> 127.121.138.0:36990 recv error: Network error: failed to read from TLS
> socket: Connection reset by peer (error 104)
> 2689 W0510 00:47:17.822394 460 connection.cc:462] client connection to
> 127.121.138.0:50064 recv error: Network error: failed to read from TLS
> socket: Connection reset by peer (error 104)
> 2690 SUMMARY: AddressSanitizer: undefined-behavior
> ../../src/kudu/security/tls_socket.cc:80:19 in
> {code}
> The code in question looks like this as of master rev
> a877566e9477242c015758d105c8e616248af7c6:
> {code}
> 69 Status TlsSocket::Writev(const struct ::iovec *iov, int iov_len, int32_t *nwritten) {
> 70   SCOPED_OPENSSL_NO_PENDING_ERRORS;
> 71   CHECK(ssl_);
> 72   int32_t total_written = 0;
> 73   // Allows packets to be aggresively be accumulated before sending.
> 74   RETURN_NOT_OK(SetTcpCork(1));
> 75   Status write_status = Status::OK();
> 76   for (int i = 0; i < iov_len; ++i) {
> 77     int32_t frame_size = iov[i].iov_len;
> 78     // Don't return before unsetting TCP_CORK.
> 79     write_status = Write(static_cast<uint8_t*>(iov[i].iov_base), frame_size, nwritten);
> 80     total_written += *nwritten;
> 81     if (*nwritten < frame_size) break;
> 82   }
> 83   RETURN_NOT_OK(SetTcpCork(0));
> 84   *nwritten = total_written;
> 85   return write_status;
> 86 }
> {code}
> I'm guessing that Write() returned a non-OK status without setting the
> out-param, so line 80 added whatever uninitialized value was on the stack.
> At the time of writing, the logs can be found here:
> http://dist-test.cloudera.org/job?job_id=mpercy.1494377196.9182
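If the guess above is right, one way to avoid the undefined read is to give the
per-frame byte count a defined value before each Write() call and to add it to
the total only when Write() succeeds. The following is a minimal standalone
sketch of that idea, not the change under review in Gerrit. Status, Frame,
FakeSocket and SafeWritev are hypothetical stand-ins written for this example
(they are not Kudu classes), and the TCP_CORK handling is omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a status type; only what this sketch needs.
class Status {
 public:
  Status() : ok_(true) {}
  static Status OK() { return Status(); }
  static Status NetworkError(const std::string& msg) { return Status(false, msg); }
  bool ok() const { return ok_; }

 private:
  Status(bool ok, const std::string& msg) : ok_(ok), msg_(msg) {}
  bool ok_;
  std::string msg_;
};

// Hypothetical replacement for struct iovec, to keep the sketch portable.
struct Frame {
  const uint8_t* base;
  int32_t len;
};

// Hypothetical socket whose Write() fails on one chosen call and, like the
// suspected bug, leaves *nwritten untouched when it fails.
class FakeSocket {
 public:
  explicit FakeSocket(int fail_at) : fail_at_(fail_at), calls_(0) {}

  Status Write(const uint8_t* buf, int32_t amt, int32_t* nwritten) {
    (void)buf;  // payload is irrelevant to this sketch
    if (calls_++ == fail_at_) {
      return Status::NetworkError("connection reset by peer");
    }
    *nwritten = amt;
    return Status::OK();
  }

 private:
  int fail_at_;  // index of the call that fails; -1 means never fail
  int calls_;
};

// Accumulates bytes written across frames without ever reading an
// out-param that Write() may not have set.
Status SafeWritev(FakeSocket* sock, const std::vector<Frame>& frames,
                  int32_t* nwritten) {
  int32_t total_written = 0;
  Status write_status = Status::OK();
  for (const Frame& f : frames) {
    int32_t frame_written = 0;  // defined even if Write() never sets it
    write_status = sock->Write(f.base, f.len, &frame_written);
    if (!write_status.ok()) break;     // add nothing for a failed write
    total_written += frame_written;
    if (frame_written < f.len) break;  // short write: stop, as the original loop does
  }
  *nwritten = total_written;
  return write_status;
}
```

With three frames of 10, 20 and 30 bytes and a failure injected on the second
Write() call, SafeWritev returns the error and reports 10 bytes written,
regardless of what the caller's variable held beforehand.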