[ 
https://issues.apache.org/jira/browse/SPARK-51685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-51685:
-----------------------------------
    Labels: pull-request-available  (was: )

> Excessive INFO logging from RocksDB operations causing oversized executor stderr 
> files
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-51685
>                 URL: https://issues.apache.org/jira/browse/SPARK-51685
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 4.1.0
>            Reporter: Vinod KC
>            Priority: Minor
>              Labels: pull-request-available
>
> Long-running Structured Streaming applications that use the RocksDB state store 
> fail after some time with a "No space left on device" error.
> The executor logs show an excessive log volume: around 1 GB of data per 
> minute of runtime. 
> Each INFO log entry prints more than 4000 lines (the list of files involved in 
> RocksDB operations).
> E.g.:
> {code:java}
> 25/03/19 10:38:32 INFO RocksDBFileManager 
> [StateStoreId(opId=0,partId=734,name=default), 
> queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, 
> queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]: Saving checkpoint files for 
> version 285264 - 2912 files 
> /local_disk0/spark-fe801ff3-f023-4554-8237-8914d37c1485/executor-41cadbd8-1863-4fc9-986d-607fa1f6d797/spark-7990e75a-1c8e-48a6-b59f-b174342e7837/[StateStoreId(opId=0,partId=734,name=default),
>  queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, 
> queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]-506ec0a7-fd75-4a75-9b09-717a64ee3478/checkpoint-42a1f8e8-a467-4c6a-9581-401d352a46bf/005040.log
>  - 3135 bytes 
> /local_disk0/spark-fe801ff3-f023-4554-8237-8914d37c1485/executor-41cadbd8-1863-4fc9-986d-607fa1f6d797/spark-7990e75a-1c8e-48a6-b59f-b174342e7837/[StateStoreId(opId=0,partId=734,name=default),
>  queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, 
> queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]-506ec0a7-fd75-4a75-9b09-717a64ee3478/checkpoint-42a1f8e8-a467-4c6a-9581-401d352a46bf/013865.log
>  - 1368 bytes  
>  ....{code}
> Here the INFO logs print every checkpoint file path. These verbose INFO logs 
> consume the streaming application's disk space, since most users keep INFO as 
> the default log level.
> It would therefore be better to log such details at DEBUG level.
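
The proposal above (a short summary at INFO, the full file list only at DEBUG) could look roughly like the sketch below. This is not the Spark code or the actual patch: the class CheckpointLogSketch, the method logSavedFiles, and the use of java.util.logging (where FINE plays the role of DEBUG) are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration only; names are not taken from Spark.
class CheckpointLogSketch {
    static final Logger LOG = Logger.getLogger("CheckpointLogSketch");

    // files maps each checkpoint file path to its size in bytes.
    static void logSavedFiles(long version, Map<String, Long> files) {
        long totalBytes = 0;
        for (long size : files.values()) {
            totalBytes += size;
        }
        // One bounded line per checkpoint at INFO, regardless of file count.
        LOG.info("Saving checkpoint files for version " + version
                + " - " + files.size() + " files, " + totalBytes + " bytes");

        // The per-file listing is built and emitted only when the
        // debug-equivalent level (FINE) is enabled.
        if (LOG.isLoggable(Level.FINE)) {
            StringBuilder listing = new StringBuilder();
            for (Map.Entry<String, Long> e : files.entrySet()) {
                listing.append(System.lineSeparator())
                       .append(e.getKey()).append(" - ")
                       .append(e.getValue()).append(" bytes");
            }
            LOG.fine("Checkpoint files for version " + version + ":" + listing);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("005040.log", 3135L);
        files.put("013865.log", 1368L);
        logSavedFiles(285264L, files);
    }
}
```

Guarding the listing with isLoggable also avoids building a string of several thousand lines when it would be discarded anyway.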



--
This message was sent by Atlassian Jira
(v8.20.10#820010)