[
https://issues.apache.org/jira/browse/SPARK-33620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladislav Sterkhov updated SPARK-33620:
---------------------------------------
Description:
Hello, I have a problem with high memory usage: the input is a ~2000 GB HDFS dataset. With a 300 GB input the task starts and completes, but we need to process input of unlimited size. Please help.
!VlwWJ.png|width=644,height=150!
!mgg1s.png|width=651,height=182!
This is my code:
var allTrafficRDD = sparkContext.emptyRDD[String]
for (traffic <- trafficBuffer) {
  logger.info("Load traffic path - " + traffic)
  val trafficRDD = sparkContext.textFile(traffic)
  if (isValidTraffic(trafficRDD, isMasterData)) {
    allTrafficRDD = allTrafficRDD ++ filterTraffic(trafficRDD)
  }
}
hiveService.insertTrafficRDD(allTrafficRDD.repartition(beforeInsertPartitionsNum), outTable, isMasterData)
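One possible cause worth checking: unioning RDDs one by one inside a loop builds a deeply nested lineage, and the driver-side overhead of that lineage grows with the number of input paths. A sketch of an alternative, assuming the same `trafficBuffer`, `isValidTraffic`, and `filterTraffic` as in the snippet above, would build all the per-path RDDs first and combine them in a single call to `SparkContext.union`:

```scala
// Hypothetical rework of the loop above; assumes trafficBuffer, isValidTraffic
// and filterTraffic behave as in the original snippet.
val validRDDs = trafficBuffer.toSeq
  .map(path => sparkContext.textFile(path))
  .filter(rdd => isValidTraffic(rdd, isMasterData))
  .map(filterTraffic)

// SparkContext.union builds one flat UnionRDD instead of the deeply
// nested lineage produced by repeated `++` in a loop.
val allTrafficRDD =
  if (validRDDs.isEmpty) sparkContext.emptyRDD[String]
  else sparkContext.union(validRDDs)
```

This does not change what data is read, only how the union is expressed, so it is a cheap thing to try before looking at executor memory settings.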
> Task not started after filtering
> --------------------------------
>
> Key: SPARK-33620
> URL: https://issues.apache.org/jira/browse/SPARK-33620
> Project: Spark
> Issue Type: Question
> Components: Spark Core
> Affects Versions: 2.4.7
> Reporter: Vladislav Sterkhov
> Priority: Major
> Attachments: VlwWJ.png, mgg1s.png
>
>
> Hello, I have a problem with high memory usage: the input is a ~2000 GB HDFS
> dataset. With a 300 GB input the task starts and completes, but we need to
> process input of unlimited size. Please help.
>
> !VlwWJ.png|width=644,height=150!
>
> !mgg1s.png|width=651,height=182!
>
> This is my code:
> var allTrafficRDD = sparkContext.emptyRDD[String]
> for (traffic <- trafficBuffer) {
>   logger.info("Load traffic path - " + traffic)
>   val trafficRDD = sparkContext.textFile(traffic)
>   if (isValidTraffic(trafficRDD, isMasterData)) {
>     allTrafficRDD = allTrafficRDD ++ filterTraffic(trafficRDD)
>   }
> }
>
> hiveService.insertTrafficRDD(allTrafficRDD.repartition(beforeInsertPartitionsNum), outTable, isMasterData)
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]