Hello!

I'm a software developer and, as part of a project, I need to extend the
functionality of SparkInterpreter without modifying it. Instead, I need to
create a new interpreter that extends it or wraps its functionality.

I also need the Spark sub-interpreters to use my new custom interpreter.
The problem is that the Spark sub-interpreters have a direct dependency on
SparkInterpreter, because they use SparkInterpreter's class name to obtain
its instance:


    private SparkInterpreter getSparkInterpreter() {
        ...
        Interpreter p = getInterpreterInTheSameSessionByClassName(
            SparkInterpreter.class.getName());
    }


*Approach without modifying Apache Zeppelin*

My current approach to solving this is to create a SparkCustomInterpreter
that overrides the getClassName() method as follows:

public class SparkCustomInterpreter extends SparkInterpreter {
    ...

    @Override
    public String getClassName() {
        return SparkInterpreter.class.getName();
    }
}


and put the new class name in Spark's interpreter-setting.json file:

[
  {
    "group": "spark",
    "name": "spark",
    "className": "org.apache.zeppelin.spark.SparkCustomInterpreter",
    ...
    "properties": {...}
  }, ...
]


The problem with this approach is that when I run a paragraph it fails. In
general it fails because Zeppelin uses both the runtime class name of the
instance and the getClassName() method to look up the instance, and since
those two no longer match, this causes many problems.
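
For example, the mismatch looks roughly like this (a minimal sketch, just to
illustrate, assuming SparkInterpreter's Properties constructor; not actual
Zeppelin code):

    Interpreter intp = new SparkCustomInterpreter(new java.util.Properties());

    // lookups that go through getClassName() see the parent's name...
    intp.getClassName();        // "org.apache.zeppelin.spark.SparkInterpreter"

    // ...while lookups based on the runtime class see the subclass:
    intp.getClass().getName();  // "org.apache.zeppelin.spark.SparkCustomInterpreter"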

*Approaches modifying Apache Zeppelin*

There are two possible solutions related to the way the sub-interpreters
obtain the SparkInterpreter instance. One is reading the class name from a
property:


    private SparkInterpreter getSparkInterpreter() {
        ...
        Interpreter p = getInterpreterInTheSameSessionByClassName(
            *property.getProperty("zeppelin.spark.mainClass", SparkInterpreter.class.getName())*);
    }
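
With this change, the custom class name could then be supplied through the
interpreter properties, e.g. something along these lines in the custom
interpreter-setting.json (a hypothetical entry; the exact property schema may
differ between Zeppelin versions):

[
  {
    "group": "spark",
    "name": "spark",
    "className": "org.apache.zeppelin.spark.SparkCustomInterpreter",
    ...
    "properties": {
      "zeppelin.spark.mainClass": "org.apache.zeppelin.spark.SparkCustomInterpreter",
      ...
    }
  }, ...
]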

The other possibility is to modify the method
Interpreter.getInterpreterInTheSameSessionByClassName(String) so that it
returns the instance whose class name matches the parameter, or whose
superclass name matches the parameter:


@ZeppelinApi
public Interpreter getInterpreterInTheSameSessionByClassName(String className) {
  synchronized (interpreterGroup) {
    for (List<Interpreter> interpreters : interpreterGroup.values()) {
      ....
      for (Interpreter intp : interpreters) {
        if (intp.getClassName().equals(className) *||
            intp.getClass().getSuperclass().getName().equals(className)*) {
          interpreterFound = intp;
        }

        ...
      }

      ...
    }
  }
  return null;
}
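
A small variant of that check (just a sketch, not tested against the code
base) could walk the whole superclass chain instead of only the direct
superclass, so a custom interpreter more than one level below
SparkInterpreter would also match:

private static boolean matchesClassName(Interpreter intp, String className) {
  if (intp.getClassName().equals(className)) {
    return true;
  }
  // walk the runtime class and all of its superclasses
  for (Class<?> c = intp.getClass(); c != null; c = c.getSuperclass()) {
    if (c.getName().equals(className)) {
      return true;
    }
  }
  return false;
}

and the inner loop would then simply do:
if (matchesClassName(intp, className)) { interpreterFound = intp; }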


Either of the two solutions would involve modifying Apache Zeppelin code.
Do you think such a change could be contributed to the community? Or do you
see some other approach to change the way the Spark sub-interpreters get
the SparkInterpreter instance?

Any information about this would be appreciated.

Greetings

Jhon
