Thanks for taking care of this!

On Sun, Oct 9, 2016 at 4:05 PM DuyHai Doan <doanduy...@gmail.com> wrote:

> I have created a JIRA epic to track all the tasks:
> https://issues.apache.org/jira/browse/ZEPPELIN-1525
>
> I think I would start with the synchronized blocks and then move on to Eric's
> PR for Guice DI.
>
> After we have a DI mechanism, it will be much easier to inject thread pools
> for thread management and also to create JMX monitoring
>
> Any objections before I start coding?
>
>
>
> On Sat, Oct 8, 2016 at 2:05 PM, Eric Charles <e...@apache.org> wrote:
>
> > On 04/10/16 12:54, Anthony Corbacho wrote:
> >
> >> You made my day, this is the kind of email I really like!
> >>
> >> I think it's a great idea and I am willing to spend some time on it.
> >>
> >> I also want to move to a DI (Guice) architecture; let me know what you
> >> think about it.
> >>
> >
> > A PR is open for Guice DI. If someone volunteers to review it, I can rebase it:
> >
> > https://github.com/apache/zeppelin/pull/1361
> >
> >
> >
> >
> >> On Tuesday, 4 October 2016, DuyHai Doan <doanduy...@gmail.com> wrote:
> >>
> >>> Hello devs
> >>>
> >>> The code base of Zeppelin has grown very fast in the last 12 months and
> >>> it's great. It means that we have more and more contributors.
> >>>
> >>> However, to keep the project maintainable in the long term, we need regular
> >>> code refactoring.
> >>>
> >>> I have some ideas to share with you
> >>>
> >>> 1) Use Java 8 to benefit from lambdas and streams.
> >>>
> >>>   Now that Java 8 is well established, it is a good time to upgrade the
> >>> project. I believe some interpreters also need Java 8. Cassandra
> >>> interpreter right now does not have unit tests for the latest features
> >>> because the Embedded Cassandra server used for testing requires Java 8.
> >>>
> >>>  It would also be a good opportunity to go through the code base and
> >>> replace boilerplate for() loops that filter manually with the stream
> >>> shortcut: list.stream().filter(..).map(). It would greatly improve code
> >>> readability.
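
To make point 1 concrete, here is a minimal sketch of the kind of rewrite being proposed. The class and method names are hypothetical and are not taken from the Zeppelin code base; the only assumption is a list of strings that needs filtering and mapping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example: none of these names exist in Zeppelin.
class StreamRefactorSketch {

    // Pre-Java 8 style: explicit loop with manual filtering.
    static List<String> upperCaseNonEmptyLoop(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!name.isEmpty()) {
                result.add(name.toUpperCase());
            }
        }
        return result;
    }

    // Java 8 style: the same logic as a filter/map pipeline.
    static List<String> upperCaseNonEmptyStream(List<String> names) {
        return names.stream()
                .filter(name -> !name.isEmpty())
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("spark", "", "cassandra");
        System.out.println(upperCaseNonEmptyLoop(names));
        System.out.println(upperCaseNonEmptyStream(names));
    }
}
```

Both methods return the same list, so a refactoring like this can be checked by running the old and new versions side by side in a unit test.
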
> >>>
> >>> 2) Multi threading
> >>>
> >>>  I've seen synchronized blocks used in a few places in the code base.
> >>> Although perfectly valid, they have a cost at runtime, and since more and
> >>> more people are asking for multi-tenancy or using a single Zeppelin
> >>> instance to serve multiple users, I guess the synchronized blocks have a
> >>> huge cost.
> >>>
> >>> There are some solid alternatives:
> >>>
> >>>  - ConcurrentHashMap if we synchronized on a map
> >>>  - CopyOnWriteArrayList if we synchronized on a list.
> >>>
> >>> Of course, each synchronized block should be examined carefully so as not
> >>> to introduce regressions.
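
As an illustration of point 2, the sketch below contrasts a synchronized block around a plain HashMap with the equivalent ConcurrentHashMap call. It is an assumed counter example with hypothetical names, not code from Zeppelin.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example: none of these names exist in Zeppelin.
class SynchronizedVsConcurrentSketch {
    private final Map<String, Integer> lockedCounts = new HashMap<>();
    private final ConcurrentHashMap<String, Integer> concurrentCounts = new ConcurrentHashMap<>();

    // Before: every caller contends on one monitor, whatever key it touches.
    void incrementLocked(String key) {
        synchronized (lockedCounts) {
            Integer current = lockedCounts.get(key);
            lockedCounts.put(key, current == null ? 1 : current + 1);
        }
    }

    int lockedCount(String key) {
        synchronized (lockedCounts) {
            Integer current = lockedCounts.get(key);
            return current == null ? 0 : current;
        }
    }

    // After: merge() performs the read-modify-write atomically for the key.
    void incrementConcurrent(String key) {
        concurrentCounts.merge(key, 1, Integer::sum);
    }

    int concurrentCount(String key) {
        return concurrentCounts.getOrDefault(key, 0);
    }

    // Runs several threads against the concurrent map and returns the final count.
    static int runDemo(int threadCount, int incrementsPerThread) {
        SynchronizedVsConcurrentSketch sketch = new SynchronizedVsConcurrentSketch();
        Thread[] workers = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    sketch.incrementConcurrent("paragraph-runs");
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return sketch.concurrentCount("paragraph-runs");
    }
}
```

Note that swapping the map type alone is not enough: a get() followed by a put() on a ConcurrentHashMap is still a race. The compound operation has to go through an atomic method such as merge() or computeIfAbsent(), which is one reason each block needs individual review.
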
> >>>
> >>> 3) Thread management
> >>>
> >>> I've seen some usage of new Thread() {...}.run(); it may be a good time to
> >>> introduce thread pools and pass them along (inside context objects, for
> >>> example) to have more centralized thread management.
> >>>
> >>> The advantage of having thread pools is that we can manage them in a
> >>> single place, monitor them, expose the info through JMX, and also control
> >>> system resources by defining the maximum number of threads and the thread
> >>> pool queue.
> >>> 4) Server monitoring
> >>> I hear many users in the field complain that they have to restart the
> >>> Zeppelin server regularly because it "hangs" after running for a long
> >>> time.
> >>>
> >>> If we can expose some system metrics through JMX, it would help people
> >>> monitor the state of Zeppelin server and take appropriate actions
> >>>
> >>> Right now we may only focus on monitoring the server itself, not the
> >>> interpreter JVM processes. That can be done in a second step.
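
For point 4, exposing metrics through JMX can be as small as one MXBean registered with the platform MBean server. The sketch below publishes a few ThreadPoolExecutor counters. The interface, the class names, and the ObjectName "org.example.sketch:type=PoolStats" are all hypothetical and chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical example: none of these names exist in Zeppelin.
class JmxPoolMonitorSketch {

    // The MXBean suffix and public modifier are what JMX looks for.
    public interface PoolStatsMXBean {
        int getActiveCount();
        int getMaximumPoolSize();
        int getQueueSize();
        long getCompletedTaskCount();
    }

    // Read-only view over an executor; each getter becomes a JMX attribute.
    static final class PoolStats implements PoolStatsMXBean {
        private final ThreadPoolExecutor pool;

        PoolStats(ThreadPoolExecutor pool) {
            this.pool = pool;
        }

        @Override public int getActiveCount() { return pool.getActiveCount(); }
        @Override public int getMaximumPoolSize() { return pool.getMaximumPoolSize(); }
        @Override public int getQueueSize() { return pool.getQueue().size(); }
        @Override public long getCompletedTaskCount() { return pool.getCompletedTaskCount(); }
    }

    // Registers the bean, reads one attribute back through JMX, then unregisters.
    static Object readViaJmx(ThreadPoolExecutor pool, String objectName, String attribute) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            ObjectName name = new ObjectName(objectName);
            server.registerMBean(new PoolStats(pool), name);
            try {
                return server.getAttribute(name, attribute);
            } finally {
                server.unregisterMBean(name);
            }
        } catch (JMException e) {
            throw new IllegalStateException("JMX access failed", e);
        }
    }

    static int demoMaximumPoolSize() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 3, 30L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            return (Integer) readViaJmx(pool, "org.example.sketch:type=PoolStats", "MaximumPoolSize");
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("MaximumPoolSize via JMX: " + demoMaximumPoolSize());
    }
}
```

In a real server the bean would stay registered for the lifetime of the pool so that tools such as JConsole can poll it. The register/read/unregister cycle here only keeps the example self-contained.
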
> >>>
> >>>
> >>> What do you think about these ideas?
> >>>
> >>>
> >>
>
