It seems I am still hitting the same issue in a different scenario. Using the 'dylanmei/docker-zeppelin' container, I get the same error as before when trying to create a Spark DataFrame from a pandas DataFrame.

code:

    %pyspark
    import pandas as pd
    names = ['Bob','Jessica','Mary','John','Mel']
    births = [968, 155, 77, 578, 973]
    BabyDataSet = zip(names,births)
    df = pd.DataFrame(data = BabyDataSet, columns=['Names', 'Births'])
    rdf = sqlc.createDataFrame(df)

result:

    (<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.\n', JavaObject id=o49), <traceback object at 0x7f0c819dce60>)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-3-createDataframe-error-with-pandas-df-tp22053p22809.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
