Thanks, Prasad! This is very helpful. The version we are using already includes Jetty 9 and ZEPPELIN-820.

However, we still see frequent web hangs caused by broadcasting updateNote. I may remove the synchronized block on noteSocketMap as a temporary fix, but I am wondering whether there is a better solution.

+ zeppelin-dev, since this issue may significantly limit the scalability of Zeppelin if there is a bad connection.

One potential optimization: instead of locking the whole map, use fine-grained locks on map entries.

Best,
Johnny

On Tue, Jun 28, 2016 at 4:34 AM, Prasad Wagle <prasadwa...@gmail.com> wrote:

> Hi Johnny,
>
> What version of the server are you using?
>
> You may be interested in the following:
>
> Email thread
> <http://mail-archives.apache.org/mod_mbox/zeppelin-users/201604.mbox/%3CCAMKL62nmLDJRghYB39MdnbAiRT6wvso0eVePKRtKurmu5iM%2BuQ%40mail.gmail.com%3E>
> discussing zeppelin server hangs and reducing websocket connections
> From the thread: "I removed synchronized (noteSocketMap) from broadcast so
> that one bad socket does not hang the server." This change helps with
> performance as well and hasn't caused any problems. However, this is not a
> long-term solution and we need to find a better one.
>
> Jira issue: Reduce websocket communication by unicasting instead of
> broadcasting note list (https://issues.apache.org/jira/browse/ZEPPELIN-820)
>
> Prasad
>
> On Mon, Jun 20, 2016 at 5:04 PM, Johnny W. <jzw.ser...@gmail.com> wrote:
>
>> Hi zeppelin-users,
>>
>> This is my first email to the top-level mailing list. Congratulations on
>> graduation!
>>
>> We are hitting some performance issues when multiple users are connected
>> to the Zeppelin server. From the stack trace, many of the connections are
>> blocked on a HashMap, which is locked by
>> org.apache.zeppelin.socket.NotebookServer.broadcastNote.
>>
>> Our largest notebook is around 800K, and there are around 10 - 20
>> connections to the Zeppelin server.
>> I think we are broadcasting a large amount of data to multiple users, and
>> some slow connections hang the whole web interface.
>>
>> Is there any way to reduce the number of broadcasts to improve the web
>> performance? It is fine for us to refresh and get updates. I've attached
>> the full stack trace of this issue as well.
>>
>> Thanks!
>>
>> Johnny
>>
>> Blocking Thread:
>> --
>> "qtp1874598090-2478" prio=10 tid=0x00007f2fb0003800 nid=0x3373 waiting on condition [0x00007f329ebe9000]
>>    java.lang.Thread.State: WAITING (parking)
>>     at sun.misc.Unsafe.park(Native Method)
>>     - parking to wait for <0x0000000704e15db0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>>     at org.eclipse.jetty.util.SharedBlockingCallback$Blocker.block(SharedBlockingCallback.java:219)
>>     at org.eclipse.jetty.websocket.common.BlockingWriteCallback$WriteBlocker.block(BlockingWriteCallback.java:83)
>>     at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.blockingWrite(WebSocketRemoteEndpoint.java:107)
>>     at org.eclipse.jetty.websocket.common.WebSocketRemoteEndpoint.sendString(WebSocketRemoteEndpoint.java:387)
>>     at org.apache.zeppelin.socket.NotebookSocket.send(NotebookSocket.java:69)
>>     at org.apache.zeppelin.socket.NotebookServer.broadcast(NotebookServer.java:304)
>>     - locked <0x00000007006b6100> (a java.util.HashMap)
>>     at org.apache.zeppelin.socket.NotebookServer.broadcastNote(NotebookServer.java:384)
>>     at org.apache.zeppelin.socket.NotebookServer.updateNote(NotebookServer.java:492)
>>     at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:181)
>>     at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:56)
>>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
>>     at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
>>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
>>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
>>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
>>     at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
>>     at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
>>     at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
>>     at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
>>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
>>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
>>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Blocked Thread:
>> --
>> "qtp1874598090-2498" prio=10 tid=0x00007f306000f800 nid=0x4075 waiting for monitor entry [0x00007f329eae9000]
>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>     at org.apache.zeppelin.socket.NotebookServer.addConnectionToNote(NotebookServer.java:229)
>>     - waiting to lock <0x00000007006b6100> (a java.util.HashMap)
>>     at org.apache.zeppelin.socket.NotebookServer.sendNote(NotebookServer.java:432)
>>     at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:145)
>>     at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:56)
>>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
>>     at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
>>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
>>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
>>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
>>     at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
>>     at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
>>     at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
>>     at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
>>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
>>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
>>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>
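
P.S. To make the fine-grained locking idea at the top of this message a bit more concrete, here is a rough sketch, not a patch against NotebookServer. It assumes the note-to-sockets map could be a ConcurrentHashMap whose values are CopyOnWriteArrayLists, so that registering a connection never waits behind a slow send. Connection and NoteConnectionRegistry are hypothetical names made up for illustration; Connection only stands in for NotebookSocket.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for NotebookSocket; only send() matters here. */
interface Connection {
  void send(String message) throws IOException;
}

/** Hypothetical replacement for a HashMap guarded by synchronized (noteSocketMap). */
class NoteConnectionRegistry {
  // One independently thread-safe list per note id; there is no map-wide lock.
  private final ConcurrentMap<String, List<Connection>> connectionsByNote =
      new ConcurrentHashMap<>();

  void add(String noteId, Connection conn) {
    List<Connection> conns = connectionsByNote.get(noteId);
    if (conns == null) {
      List<Connection> fresh = new CopyOnWriteArrayList<>();
      conns = connectionsByNote.putIfAbsent(noteId, fresh);
      if (conns == null) {
        conns = fresh; // this thread won the race to create the list
      }
    }
    conns.add(conn);
  }

  void remove(String noteId, Connection conn) {
    List<Connection> conns = connectionsByNote.get(noteId);
    if (conns != null) {
      conns.remove(conn);
    }
  }

  /** Sends to every connection of one note and returns the number of failed sends. */
  int broadcast(String noteId, String message) {
    List<Connection> conns = connectionsByNote.get(noteId);
    if (conns == null) {
      return 0;
    }
    int failures = 0;
    // CopyOnWriteArrayList iterates over a snapshot, so no lock is held while sending.
    for (Connection conn : conns) {
      try {
        conn.send(message);
      } catch (IOException e) {
        failures++; // one bad socket does not stop the rest of the broadcast
      }
    }
    return failures;
  }
}
```

With something like this, a thread in the position of addConnectionToNote would no longer queue behind a broadcast. A slow socket would still tie up the thread doing the blocking write, so this only narrows the problem; it does not remove it.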