[ 
https://issues.apache.org/jira/browse/KUDU-2004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Percy reassigned KUDU-2004:
--------------------------------

         Assignee: Edward Fancher
      Code Review: http://gerrit.cloudera.org:8080/7141
    Fix Version/s: 1.5.0

> Undefined behavior in TlsSocket::Writev()
> -----------------------------------------
>
>                 Key: KUDU-2004
>                 URL: https://issues.apache.org/jira/browse/KUDU-2004
>             Project: Kudu
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Mike Percy
>            Assignee: Edward Fancher
>             Fix For: 1.5.0
>
>
> I got a UBSAN error on Jenkins when running 
> TsRecoveryITest.TestCrashBeforeWriteLogSegmentHeader under ASAN. I had made a 
> one-line change that I was testing (unrelated to this). This is the error 
> message I got:
> {code}
> 2685 E0510 00:47:17.810153   656 fault_injection.cc:54] Injecting fault: 
> FLAGS_fault_crash_before_write_log_segment_header (process will exit)
> 2686 W0510 00:47:17.821826   451 connection.cc:462] client connection to 
> 127.121.138.0:50064 recv error: Network error: failed to read from TLS 
> socket: Connection reset by peer (error 104)
> 2687 ../../src/kudu/security/tls_socket.cc:80:19: runtime error: signed 
> integer overflow: 2018308256 + 2018308256 cannot be represented in type 'int'
> 2688 W0510 00:47:17.821890   460 connection.cc:462] server connection from 
> 127.121.138.0:36990 recv error: Network error: failed to read from TLS 
> socket: Connection reset by peer (error 104)
> 2689 W0510 00:47:17.822394   460 connection.cc:462] client connection to 
> 127.121.138.0:50064 recv error: Network error: failed to read from TLS 
> socket: Connection reset by peer (error 104)
> 2690 SUMMARY: AddressSanitizer: undefined-behavior 
> ../../src/kudu/security/tls_socket.cc:80:19 in
> {code}
> The code in question looks like this as of master rev 
> a877566e9477242c015758d105c8e616248af7c6
> {code}
>  69 Status TlsSocket::Writev(const struct ::iovec *iov, int iov_len, int32_t 
> *nwritten) {
>  70   SCOPED_OPENSSL_NO_PENDING_ERRORS;
>  71   CHECK(ssl_);
>  72   int32_t total_written = 0;
>  73   // Allows packets to be aggresively be accumulated before sending.
>  74   RETURN_NOT_OK(SetTcpCork(1));
>  75   Status write_status = Status::OK();
>  76   for (int i = 0; i < iov_len; ++i) {
>  77     int32_t frame_size = iov[i].iov_len;
>  78     // Don't return before unsetting TCP_CORK.
>  79     write_status = Write(static_cast<uint8_t*>(iov[i].iov_base), 
> frame_size, nwritten);
>  80     total_written += *nwritten;
>  81     if (*nwritten < frame_size) break;
>  82   }
>  83   RETURN_NOT_OK(SetTcpCork(0));
>  84   *nwritten = total_written;
>  85   return write_status;
>  86 }
> {code}
> I'm guessing the out-param was never set because Write() returned a non-OK 
> status, so line 80 added whatever was on the stack to total_written.
> At the time of writing, the logs can be found here: 
> http://dist-test.cloudera.org/job?job_id=mpercy.1494377196.9182
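
The guess above can be illustrated with a minimal, self-contained sketch. It is not the patch under review: Status, Frame, FakeWrite and WritevSketch are hypothetical stand-ins invented here, and the TCP_CORK handling is left out. It assumes the failure mode is a Write() that returns an error without touching its out-param, and shows one way a loop could avoid reading that value: give each frame its own zero-initialized counter and add it to the total only after checking the status.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for kudu::Status; only what this sketch needs.
struct Status {
  bool ok_ = true;
  std::string msg_;
  static Status OK() { return Status(); }
  static Status NetworkError(std::string msg) {
    Status s;
    s.ok_ = false;
    s.msg_ = std::move(msg);
    return s;
  }
  bool ok() const { return ok_; }
};

// Hypothetical frame description used to drive the fake writer below.
struct Frame {
  int32_t size;      // bytes the caller wants to write
  int32_t accepted;  // bytes the fake socket accepts on success
  bool fail;         // simulate a write error for this frame
};

// Assumed failure mode: on error the out-param is left untouched.
Status FakeWrite(const Frame& frame, int32_t* nwritten) {
  assert(nwritten != nullptr);
  if (frame.fail) {
    return Status::NetworkError("simulated write failure");
  }
  *nwritten = frame.accepted;
  return Status::OK();
}

// Sketch of a Writev-style loop that never reads an unset out-param.
Status WritevSketch(const std::vector<Frame>& frames, int32_t* nwritten) {
  assert(nwritten != nullptr);
  int32_t total_written = 0;
  Status write_status = Status::OK();
  for (const Frame& frame : frames) {
    // Per-frame counter, initialized so a failed write cannot leak garbage.
    int32_t frame_written = 0;
    write_status = FakeWrite(frame, &frame_written);
    // Check the status before using the byte count.
    if (!write_status.ok()) break;
    total_written += frame_written;
    // Stop on a short write, as the original loop does.
    if (frame_written < frame.size) break;
  }
  *nwritten = total_written;
  return write_status;
}
```

With this shape, a failure on the second frame reports only the bytes written by the first frame, and the caller's value of *nwritten on entry is never read.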



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
