Hello, We see that BulkOutputFormat fails to stream data when multiple reduce instances run on the same host. We get the same error messages that issue https://issues.apache.org/jira/browse/CASSANDRA-4223 tries to address. It looks like (ip-address + in_out_flag + atomic integer) is not unique enough for a sessionId when multiple JVMs stream from one physical host.
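To illustrate what we suspect is happening, here is a minimal sketch. It is not Cassandra's actual code: the class and method names (SessionIdCollision, FakeJvm, naiveSessionId, uniqueSessionId) and the IP address are hypothetical. Each FakeJvm stands in for one reducer JVM with its own counter starting at zero, so two of them on the same host produce the same id. Adding a per-process random UUID is shown only as one possible way to make the ids distinct.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

public class SessionIdCollision {

    /** Stands in for one reducer JVM: every process has its own counter. */
    static final class FakeJvm {
        private final AtomicInteger counter = new AtomicInteger(0);
        // Hypothetical per-process token, generated once at JVM start.
        private final UUID processId = UUID.randomUUID();
        private final String host;

        FakeJvm(String host) {
            this.host = host;
        }

        /** The scheme described above: ip-address + in_out_flag + atomic integer. */
        String naiveSessionId(boolean outgoing) {
            return host + "-" + (outgoing ? "out" : "in") + "-" + counter.incrementAndGet();
        }

        /** Same scheme, with the per-process token added (an assumption, not a known fix). */
        String uniqueSessionId(boolean outgoing) {
            return naiveSessionId(outgoing) + "-" + processId;
        }
    }

    /** Two JVMs on one host each open their first outgoing session. */
    static boolean naiveIdsCollide() {
        FakeJvm first = new FakeJvm("10.0.0.1");
        FakeJvm second = new FakeJvm("10.0.0.1");
        return first.naiveSessionId(true).equals(second.naiveSessionId(true));
    }

    static boolean uniqueIdsCollide() {
        FakeJvm first = new FakeJvm("10.0.0.1");
        FakeJvm second = new FakeJvm("10.0.0.1");
        return first.uniqueSessionId(true).equals(second.uniqueSessionId(true));
    }

    public static void main(String[] args) {
        System.out.println("naive ids collide:  " + naiveIdsCollide());
        System.out.println("unique ids collide: " + uniqueIdsCollide());
    }
}
```

The counter is only unique within a single JVM, so the first session opened by each reducer on a host ends up with the same id on the receiving node.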
We can work around the problem by setting one reducer per machine in the Hadoop config, but that is not an option we want to deploy. Thanks, Alexei Bakanov