[ https://issues.apache.org/jira/browse/SPARK-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934218#comment-15934218 ]

Tomas Pranckevicius edited comment on SPARK-12261 at 3/29/17 12:28 PM:
-----------------------------------------------------------------------

Thank you, Shea, for the details. These solutions will not necessarily apply
to my situation, but one thing is clear: they do not solve my problem. There
is a new project, Apache SystemML (originally from IBM), that might address
these issues, because the current version of MLlib most probably does not
support automatic optimization based on data and cluster characteristics to
ensure efficiency and scalability. So let's see what happens with Apache
SystemML. More: http://systemml.apache.org


was (Author: tomas pranckevicius):
Thank you, Shea, for the details. These solutions will not necessarily apply
to my situation, but one thing is clear: they do not solve my problem.

> pyspark crash for large dataset
> -------------------------------
>
>                 Key: SPARK-12261
>                 URL: https://issues.apache.org/jira/browse/SPARK-12261
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.5.2
>         Environment: windows
>            Reporter: zihao
>
> I tried to import a local text file (over 100 MB) via textFile in PySpark;
> when I ran data.take(), it failed and gave error messages including:
> 15/12/10 17:17:43 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
> Traceback (most recent call last):
>   File "E:/spark_python/test3.py", line 9, in <module>
>     lines.take(5)
>   File "D:\spark\spark-1.5.2-bin-hadoop2.6\python\pyspark\rdd.py", line 1299, in take
>     res = self.context.runJob(self, takeUpToNumLeft, p)
>   File "D:\spark\spark-1.5.2-bin-hadoop2.6\python\pyspark\context.py", line 916, in runJob
>     port = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, partitions)
>   File "C:\Anaconda2\lib\site-packages\py4j\java_gateway.py", line 813, in __call__
>     answer, self.gateway_client, self.target_id, self.name)
>   File "D:\spark\spark-1.5.2-bin-hadoop2.6\python\pyspark\sql\utils.py", line 36, in deco
>     return f(*a, **kw)
>   File "C:\Anaconda2\lib\site-packages\py4j\protocol.py", line 308, in get_return_value
>     format(target_id, ".", name), value)
> py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.net.SocketException: Connection reset by peer: socket write error
> Then I ran the same code on a small text file, and this time .take() worked fine.
> How can I solve this problem?
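
For reference, here is a minimal sketch of the reported scenario together
with two commonly suggested first steps: raising the number of input
partitions so each task streams a smaller payload to the Python worker, and
giving the driver more memory. The file path and the partition count are
illustrative placeholders, and neither change is confirmed as the fix for
this ticket.

    # Minimal sketch, assuming Spark 1.5.x on Windows; the path and the
    # partition count are placeholders, not values from the ticket.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("take-on-large-textfile").setMaster("local[*]")
    sc = SparkContext(conf=conf)

    # More partitions mean each task writes less data over the local
    # socket between the JVM and the Python worker.
    lines = sc.textFile("E:/spark_python/data.txt", minPartitions=32)
    print(lines.take(5))

    # Driver memory has to be set before the JVM starts, e.g.
    #   spark-submit --driver-memory 4g test3.py
    # (setting spark.driver.memory on SparkConf after the JVM is up has no
    # effect), which is why it is not set in code here.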



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

