Hi,

I am using the MLlib collaborative filtering API on an implicit-preference
data set. From a PySpark notebook, I am iteratively building the matrix
factorization model so that I can measure the RMSE for each combination of
this API's parameters (rank, lambda and alpha). The first six iterations
complete successfully, but on the seventh call to ALS.trainImplicit I get a
confusing exception saying that py4j cannot find the method
trainImplicitALSModel. The full trace is included at the end of this email.
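For reference, the sweep in the failing cell looks roughly like the sketch
below (the grid names ranks, lambdas, alphas and iters match the traceback,
but their values here are placeholders I made up; the ALS call itself is
commented out, since that is the line that fails on the seventh iteration):

```python
import itertools
import math

# Placeholder parameter grids -- the real notebook defines its own values.
ranks = [10, 20]
lambdas = [0.01, 0.1]
alphas = [1.0, 40.0]
iters = [10]

def rmse(pairs):
    """Root-mean-square error over (predicted, actual) rating pairs."""
    return math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))

# Each loop iteration corresponds to one ALS.trainImplicit call in the
# notebook; the exception appears on the seventh such call.
combos = list(itertools.product(ranks, lambdas, alphas, iters))
for index, (r, l, a, i) in enumerate(combos):
    # model = ALS.trainImplicit(scoreTableTrain, rank=r, iterations=i,
    #                           lambda_=l, alpha=a)  # <- fails on call #7
    pass

print(len(combos))  # 2 * 2 * 2 * 1 = 8 combinations in this toy grid
```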

I am running Spark over YARN (yarn-client mode) with five executors. The
error seems to happen entirely on the driver, as I don't see any errors in
the Spark web UI. I have tried changing the spark.yarn.am.memory
configuration value, but it doesn't help. Any suggestion on how to debug
this would be very helpful.

Thank you,
Sooraj

Here is the full error trace:

---------------------------------------------------------------------------
Py4JError                                 Traceback (most recent call last)
<ipython-input-8-ad6ca35e7521> in <module>()
      3
      4 for index, (r, l, a, i) in enumerate(itertools.product(ranks, lambdas, alphas, iters)):
----> 5     model = ALS.trainImplicit(scoreTableTrain, rank = r, iterations = i, lambda_ = l, alpha = a)
      6
      7 predictionsTrain = model.predictAll(userProductTrainRDD)

/usr/local/spark-1.4/spark-1.4.0-bin-hadoop2.6/python/pyspark/mllib/recommendation.pyc in trainImplicit(cls, ratings, rank, iterations, lambda_, blocks, alpha, nonnegative, seed)
    198                       nonnegative=False, seed=None):
    199         model = callMLlibFunc("trainImplicitALSModel", cls._prepare(ratings), rank,
--> 200                               iterations, lambda_, blocks, alpha, nonnegative, seed)
    201         return MatrixFactorizationModel(model)
    202

/usr/local/spark-1.4/spark-1.4.0-bin-hadoop2.6/python/pyspark/mllib/common.pyc in callMLlibFunc(name, *args)
    126     sc = SparkContext._active_spark_context
    127     api = getattr(sc._jvm.PythonMLLibAPI(), name)
--> 128     return callJavaFunc(sc, api, *args)
    129
    130

/usr/local/spark-1.4/spark-1.4.0-bin-hadoop2.6/python/pyspark/mllib/common.pyc in callJavaFunc(sc, func, *args)
    119     """ Call Java Function """
    120     args = [_py2java(sc, a) for a in args]
--> 121     return _java2py(sc, func(*args))
    122
    123

/usr/local/lib/python2.7/site-packages/py4j/java_gateway.pyc in __call__(self, *args)
    536         answer = self.gateway_client.send_command(command)
    537         return_value = get_return_value(answer, self.gateway_client,
--> 538                 self.target_id, self.name)
    539
    540         for temp_arg in temp_args:

/usr/local/lib/python2.7/site-packages/py4j/protocol.pyc in get_return_value(answer, gateway_client, target_id, name)
    302                 raise Py4JError(
    303                     'An error occurred while calling {0}{1}{2}. Trace:\n{3}\n'.
--> 304                     format(target_id, '.', name, value))
    305         else:
    306             raise Py4JError(

Py4JError: An error occurred while calling o667.trainImplicitALSModel. Trace:
py4j.Py4JException: Method trainImplicitALSModel([class org.apache.spark.api.java.JavaRDD, class java.lang.Integer, class java.lang.Integer, class java.lang.Integer, class java.lang.Integer, class java.lang.Double, class java.lang.Boolean, null]) does not exist
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
        at py4j.Gateway.invoke(Gateway.java:252)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:207)
        at java.lang.Thread.run(Thread.java:724)
