-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/28779/
-----------------------------------------------------------

(Updated Dec. 8, 2014, 7:39 p.m.)


Review request for hive, Brock Noland, chengxiang li, Szehon Ho, and Xuefu Zhang.


Bugs: HIVE-9036
    https://issues.apache.org/jira/browse/HIVE-9036


Repository: hive-git


Description (updated)
-------

This is the RPC layer for the spark-client; it just provides the RPC mechanism for replacing akka in the spark-client communication. See README.md for more discussion about the approach.

The client impl is still using akka - that will be changed in a separate commit.

[spark-client] Goodbye akka, hello netty.

Use the netty-based RPC implementation for spark-client. The API is still exactly the same; only the internals have changed. API semantics should still be the same too, although with the new RPC system there is still room for improvement.

The main thing to remember right now is that all types sent over the wire need an empty constructor. Currently, failing to provide one results in errors being printed to the logs and RPC channels being closed. For easier debugging in the future, I'm planning to change the RPC internals a bit so that a failure to deserialize the payload causes an RPC failure instead.

[spark-client] Use a Promise in JobHandleImpl.

Makes it integrate better with the RPC layer, and avoids some ugly code in the process.

[spark-client] Set up RPC thread factory, add missing timeout.

[spark-client] Better handle serialization errors.

It's hard to report back serialization errors that happen during an RPC if the RPC header and payload are serialized as a single object. So break those down into two different objects; we're sure the internal RPC header is serializable, so this lets us detect the case where the user payload is not serializable and provide nicer errors, making it easier to debug things.

Fix buglets.

Fix some serialization issues.

More fixes.

- More types needed serialization tweaks (empty constructors or transient fields)
- Fix a race when registering outgoing RPCs.
- Add a TODO to fix some very suspicious code in SparkCounter.

Fix SparkCounters serialization.

This code is a little bit weird; the "Accumulator" instance cannot be serialized by Kryo's default serializer, so we cannot include it when sending SparkCounters back to the RSC. But we can't make the field transient, because apparently it needs to be serialized when sent to actual tasks.

Long term it would be better to separate internal counter usage from RSC counter views (in the spirit of keeping RSC completely independent from the rest of Hive). But for now, make things work by copying things.

Configurable channel traffic log level.

Avoid an ugly exception when remote end dies.

Make RPC server instance private in factory.

Add max message size checks.

Fix bug when looking for a suitable address.


Diffs (updated)
-----

  pom.xml 630b10ce35032e4b2dee50ef3dfe5feb58223b78
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/LocalHiveSparkClient.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/spark/status/impl/RemoteSparkJobStatus.java PRE-CREATION
  spark-client/pom.xml PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/ClientUtils.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/JobHandleImpl.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/Protocol.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/RemoteDriver.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientFactory.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/metrics/InputMetrics.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/metrics/Metrics.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleReadMetrics.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/metrics/ShuffleWriteMetrics.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/KryoMessageCodec.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/README.md PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/Rpc.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcConfiguration.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcDispatcher.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcException.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounter.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounterGroup.java PRE-CREATION
  spark-client/src/main/java/org/apache/hive/spark/counter/SparkCounters.java PRE-CREATION
  spark-client/src/test/java/org/apache/hive/spark/client/TestSparkClient.java PRE-CREATION
  spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestKryoMessageCodec.java PRE-CREATION
  spark-client/src/test/java/org/apache/hive/spark/client/rpc/TestRpc.java PRE-CREATION

Diff: https://reviews.apache.org/r/28779/diff/


Testing
-------

spark-client unit tests, plus some qtests.


Thanks,

Marcelo Vanzin
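
Illustrative note on the "Better handle serialization errors" item in the description: the idea of serializing the RPC header and the user payload as two separate objects can be sketched roughly as below. This is only a sketch under stated assumptions, not the code in this patch. All names (SplitFrameSketch, Header, Frame, encode) are hypothetical, and java.io serialization stands in for the Kryo codec so that the example runs with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical sketch; java.io serialization is used in place of Kryo.
class SplitFrameSketch {

  // Internal header: known to be serializable, identifies the RPC.
  static final class Header implements Serializable {
    private static final long serialVersionUID = 1L;
    final long rpcId;
    final String type; // "CALL" or "ERROR"

    Header(long rpcId, String type) {
      this.rpcId = rpcId;
      this.type = type;
    }
  }

  // Header and body are kept as two independently serialized objects.
  static final class Frame {
    final Header header;
    final byte[] headerBytes;
    final byte[] bodyBytes;

    Frame(Header header, byte[] headerBytes, byte[] bodyBytes) {
      this.header = header;
      this.headerBytes = headerBytes;
      this.bodyBytes = bodyBytes;
    }
  }

  // Serializes a single object into its own byte array.
  static byte[] serialize(Object o) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(o);
    }
    return bytes.toByteArray();
  }

  // Tries the payload first. If it cannot be serialized, the frame still
  // carries a valid header (same rpcId) plus an error message as the body,
  // so the failure can be tied to a specific RPC instead of being lost.
  static Frame encode(long rpcId, Object payload) {
    Header header;
    byte[] bodyBytes;
    try {
      try {
        bodyBytes = serialize(payload);
        header = new Header(rpcId, "CALL");
      } catch (IOException e) {
        header = new Header(rpcId, "ERROR");
        bodyBytes = serialize("Cannot serialize payload: " + e);
      }
      return new Frame(header, serialize(header), bodyBytes);
    } catch (IOException e) {
      // Header or error-message serialization failing is unexpected here.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Frame ok = encode(1L, "hello");
    Frame bad = encode(2L, new Object()); // java.lang.Object is not Serializable
    System.out.println(ok.header.rpcId + " " + ok.header.type);   // prints "1 CALL"
    System.out.println(bad.header.rpcId + " " + bad.header.type); // prints "2 ERROR"
  }
}
```

With a single combined object, a non-serializable payload would make the whole message fail to encode and the RPC id would be lost along with it; with the split, the header survives and the error can be reported against the right call.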