> On Sep 2, 2016, at 5:58 PM, 汪洋 <tiandiwo...@icloud.com> wrote:
>
> Yeah, using the external shuffle service is a reasonable choice, but I
> think we will still face the same problem. We use SSDs to store shuffle
> files for performance reasons. If the shuffle files are not going to be
> used anymore, we want them deleted instead of taking up valuable SSD
> space.
>
I am not very familiar with the external shuffle service, though. Is it
going to help in this case? :-)
>> On Sep 2, 2016, at 5:40 PM, Artur Sukhenko <artur.sukhe...@gmail.com> wrote:
>>
>> Hi Yang,
>>
>> Isn't the external shuffle service better for long-running applications?
>> "It runs as a standalone application and manages shuffle output files so
>> they are available for executors at all time"
>>
>> It is described here:
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-ExternalShuffleService.html
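>>
>> As I understand it, in standalone mode the service runs inside each
>> Worker process once enabled. A minimal sketch of turning it on from the
>> application side (spark.shuffle.service.enabled is the one real setting
>> here; the app name is hypothetical, and the same flag typically also
>> has to be set in the workers' configuration):
>>
>>   import org.apache.spark.{SparkConf, SparkContext}
>>
>>   // Let the Worker-side shuffle service serve shuffle files so they
>>   // remain available across executor crashes and restarts.
>>   val conf = new SparkConf()
>>     .setAppName("shuffle-service-example") // hypothetical name
>>     .set("spark.shuffle.service.enabled", "true")
>>   val sc = new SparkContext(conf)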
>>
>> ---
>> Artur
>>
>> On Fri, Sep 2, 2016 at 12:30 PM 汪洋 <tiandiwo...@icloud.com> wrote:
>> Thank you for your response.
>>
>> We are using Spark 1.6.2 in standalone deploy mode with dynamic
>> allocation disabled.
>>
>> I have traced the code. IMHO, this cleanup is not handled by the
>> shutdown hooks directly. The shutdown hooks only send an
>> “ExecutorStateChanged” message to the worker, and when the worker sees
>> the message, it cleans up the directory only if the application has
>> finished. In our case, the application has not finished (it is
>> long-running). The executor exits due to some unknown error and is
>> restarted by the worker right away. In this scenario, the old
>> directories are never deleted.
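>>
>> Roughly, the worker-side logic I traced behaves like the following
>> sketch (a simplified paraphrase, not the actual Spark source; the names
>> handleExecutorStateChanged and cleanupAppDir only approximate the real
>> code):
>>
>>   // The worker removes an application's directories (including its
>>   // blockmgr-* dirs) only once the whole application has finished.
>>   case class ExecutorStateChanged(appId: String, finished: Boolean)
>>
>>   def cleanupAppDir(appId: String): Unit = ??? // placeholder
>>
>>   def handleExecutorStateChanged(msg: ExecutorStateChanged,
>>                                  finishedApps: Set[String]): Unit = {
>>     if (msg.finished && finishedApps.contains(msg.appId)) {
>>       cleanupAppDir(msg.appId)
>>     }
>>     // If an executor merely crashed while the app keeps running,
>>     // nothing is deleted; the restarted executor creates a fresh
>>     // blockmgr-* directory next to the orphaned old one.
>>   }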
>>
>> If the application is still running, is it safe to delete the old
>> “blockmgr” directories, leaving only the newest one?
>>
>> Our temporary solution is to restart the application regularly, but we
>> are looking for a more elegant approach.
>>
>> Thanks.
>>
>> Yang
>>
>>
>>> On Sep 2, 2016, at 4:11 PM, Sun Rui <sunrise_...@163.com> wrote:
>>>
>>> Hi,
>>> Could you give more information about your Spark environment: the
>>> cluster manager, Spark version, whether dynamic allocation is used,
>>> etc.?
>>>
>>> Generally, executors delete their temporary shuffle directories on
>>> exit because JVM shutdown hooks are registered, unless the executors
>>> are killed forcibly.
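>>>
>>> A rough sketch of that mechanism (illustrative only, not the actual
>>> Spark source; the directory path is hypothetical):
>>>
>>>   import java.io.File
>>>
>>>   def deleteRecursively(f: File): Unit = {
>>>     if (f.isDirectory) Option(f.listFiles()).foreach(_.foreach(deleteRecursively))
>>>     f.delete()
>>>   }
>>>
>>>   // On a normal exit the hook removes the executor's blockmgr-*
>>>   // directories; a SIGKILL bypasses JVM shutdown hooks entirely.
>>>   val localDirs = Seq(new File("/tmp/spark/blockmgr-example"))
>>>   Runtime.getRuntime.addShutdownHook(new Thread {
>>>     override def run(): Unit = localDirs.foreach(deleteRecursively)
>>>   })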
>>>
>>> You can safely delete the directories once you are sure that the
>>> Spark applications related to them have finished. A crontab task can
>>> be used for automatic cleanup.
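>>>
>>> If you prefer a JVM job over a cron'd shell one-liner, the same idea
>>> fits in a few lines of Scala (a sketch reusing the deleteRecursively
>>> helper above; the work directory and the 7-day TTL are assumptions,
>>> and age is only a proxy, so make sure the owning applications have
>>> really finished before deleting anything):
>>>
>>>   // Delete blockmgr-* directories untouched for longer than ttlMillis.
>>>   def cleanupOldShuffleDirs(workDir: File, ttlMillis: Long): Unit = {
>>>     val cutoff = System.currentTimeMillis() - ttlMillis
>>>     Option(workDir.listFiles()).getOrElse(Array.empty[File])
>>>       .filter(d => d.isDirectory && d.getName.startsWith("blockmgr-"))
>>>       .filter(_.lastModified() < cutoff)
>>>       .foreach(deleteRecursively)
>>>   }
>>>
>>>   cleanupOldShuffleDirs(new File("/tmp/spark"), 7L * 24 * 3600 * 1000)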
>>>
>>>> On Sep 2, 2016, at 12:18, 汪洋 <tiandiwo...@icloud.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I discovered that sometimes an executor exits unexpectedly, and when
>>>> it is restarted, it creates another blockmgr directory without
>>>> deleting the old one. Thus, for a long-running application, some
>>>> shuffle files are never cleaned up, and over time those files can
>>>> take up the whole disk.
>>>>
>>>> Is there a way to clean up those unused files automatically? Or is
>>>> it safe to delete the old directories manually, leaving only the
>>>> newest one?
>>>>
>>>> Here is the executor’s local directory.
>>>> [attached screenshot: directory listing showing multiple blockmgr-* directories]
>>>>
>>>> Any advice on this?
>>>>
>>>> Thanks.
>>>>
>>>> Yang
>>>
>>
>> --
>> Artur Sukhenko
>