GitHub user HeartSaVioR opened a pull request:
https://github.com/apache/incubator-zeppelin/pull/575
ZEPPELIN-534 Discard broken thrift Client instance
### What is this PR for?
Zeppelin has been reusing broken Thrift client instances.
Since we can catch TException, we can discard any client instance that throws
TException from the client pool instead of reusing it.
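The idea can be sketched as follows. This is a minimal, self-contained illustration and not the code in this PR: `FakeTException`, `FakeClient`, `FakeClientPool` and `DiscardOnError` are hypothetical stand-ins for `TException`, the generated Thrift client and the client pool. With Apache Commons Pool, which appears in the stack trace further down, the corresponding pool calls would presumably be `returnObject` and `invalidateObject`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for org.apache.thrift.TException.
class FakeTException extends Exception {
    FakeTException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for the generated Thrift client.
class FakeClient {
    private final boolean connected;

    FakeClient(boolean connected) {
        this.connected = connected;
    }

    String interpret(String code) throws FakeTException {
        if (!connected) {
            throw new FakeTException("Connection reset");
        }
        return "ok: " + code;
    }
}

// Minimal pool: idle clients are reused, discarded clients are never handed out again.
class FakeClientPool {
    private final Deque<FakeClient> idle = new ArrayDeque<>();
    private int discarded = 0;

    void add(FakeClient client) { idle.push(client); }

    // Throws NoSuchElementException if no idle client is left.
    FakeClient borrow() { return idle.pop(); }

    void giveBack(FakeClient client) { idle.push(client); }

    // A real pool would also close the underlying transport here.
    void discard(FakeClient client) { discarded++; }

    int idleCount() { return idle.size(); }

    int discardedCount() { return discarded; }
}

class DiscardOnError {
    // Runs one call and returns the client to the pool only if the call succeeded.
    static String call(FakeClientPool pool, String code) throws FakeTException {
        FakeClient client = pool.borrow();
        boolean broken = false;
        try {
            return client.interpret(code);
        } catch (FakeTException e) {
            broken = true; // treat the client as broken once it has thrown
            throw e;
        } finally {
            if (broken) {
                pool.discard(client);
            } else {
                pool.giveBack(client);
            }
        }
    }
}
```

The `finally` block guarantees that every borrowed client is either returned or discarded exactly once, so a failed call cannot leave a dead connection in the idle set.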
### What type of PR is it?
Bug Fix | Improvement
### Todos
### Is there a relevant Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-534
### How should this be tested?
1. Run a notebook that uses the Spark interpreter.
2. Kill the Spark interpreter process with `kill -9`.
3. Run a notebook that uses the killed interpreter.
4. Run the same notebook again and check that the error log has changed.
Output of step 3:
```
java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:196)
	at java.net.SocketInputStream.read(SocketInputStream.java:122)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
	at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
	at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
	at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:220)
	at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:205)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:225)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
	at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:211)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:169)
	at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:322)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
```
Output of step 4:
```
java.net.ConnectException: Connection refused
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:579)
	at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
	at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:51)
	at org.apache.zeppelin.interpreter.remote.ClientFactory.create(ClientFactory.java:37)
	at org.apache.commons.pool2.BasePooledObjectFactory.makeObject(BasePooledObjectFactory.java:60)
	at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
	at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.getClient(RemoteInterpreterProcess.java:140)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:205)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
	at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:211)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:169)
	at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:322)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
```
The result may differ depending on how many client instances the pool creates
in its initial phase.
Before this change, the output of step 4 would be `broken pipe`, which means
the previous client instance was not discarded.
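Why the second run reports a different error can be shown with a toy model. Only the three error strings come from this description; the class `KilledInterpreterModel`, its method and its parameters are hypothetical, and the model assumes the pool always hands out the most recently returned idle client.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: it mimics the observed error sequence, not Zeppelin's real pool.
class KilledInterpreterModel {
    // One pooled connection that was opened before the interpreter was killed.
    static final class Conn {
        boolean failedBefore = false;
    }

    // Returns the error seen on each of `runs` notebook runs after the kill.
    static List<String> errorsAfterKill(int idleAtKill, int runs, boolean discardBroken) {
        Deque<Conn> idle = new ArrayDeque<>();
        for (int i = 0; i < idleAtKill; i++) {
            idle.push(new Conn());
        }
        List<String> errors = new ArrayList<>();
        for (int run = 0; run < runs; run++) {
            if (idle.isEmpty()) {
                // The pool must open a new socket, but nothing is listening any more.
                errors.add("Connection refused");
                continue;
            }
            Conn conn = idle.pop();
            errors.add(conn.failedBefore ? "Broken pipe" : "Connection reset");
            conn.failedBefore = true;
            if (!discardBroken) {
                idle.push(conn); // old behaviour: the dead client goes back into the pool
            }
        }
        return errors;
    }
}
```

In this model, `errorsAfterKill(1, 2, true)` yields `[Connection reset, Connection refused]`, matching steps 3 and 4 above, while `errorsAfterKill(1, 2, false)` yields `[Connection reset, Broken pipe]`. With two idle clients at kill time and discarding enabled, the first two runs both report `Connection reset`, which is why the result depends on the initial pool size.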
### Screenshots (if appropriate)
### Questions:
* Do the license files need an update? (No)
* Are there breaking changes for older versions? (No)
* Does this need documentation? (No)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/HeartSaVioR/incubator-zeppelin ZEPPELIN-534
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-zeppelin/pull/575.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #575
----
commit a84d0ebbcf2ca2d355582956f90213a014fe2059
Author: Jungtaek Lim <[email protected]>
Date: 2015-12-28T07:00:49Z
ZEPPELIN-534 Discard broken thrift Client instance
* We can treat client as broken when TException occurs
----