[ https://issues.apache.org/jira/browse/SPARK-51685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-51685:
-----------------------------------
    Labels: pull-request-available  (was: )

> Excessive Info logging from RocksDb operations causing too big executor stderr files
> ------------------------------------------------------------------------------------
>
>                 Key: SPARK-51685
>                 URL: https://issues.apache.org/jira/browse/SPARK-51685
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 4.1.0
>            Reporter: Vinod KC
>            Priority: Minor
>              Labels: pull-request-available
>
> Long-running Structured Streaming applications that use the RocksDB state store fail after some time with a "No space left on device" error.
> The executor logs show an excessive log volume: around 1 GB of data for 1 minute of run time.
> Each INFO log entry printed more than 4000 lines (the list of files in RocksDB operations).
> For example:
> {code:java}
> 25/03/19 10:38:32 INFO RocksDBFileManager [StateStoreId(opId=0,partId=734,name=default), queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]: Saving checkpoint files for version 285264 - 2912 files
> /local_disk0/spark-fe801ff3-f023-4554-8237-8914d37c1485/executor-41cadbd8-1863-4fc9-986d-607fa1f6d797/spark-7990e75a-1c8e-48a6-b59f-b174342e7837/[StateStoreId(opId=0,partId=734,name=default), queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]-506ec0a7-fd75-4a75-9b09-717a64ee3478/checkpoint-42a1f8e8-a467-4c6a-9581-401d352a46bf/005040.log - 3135 bytes
> /local_disk0/spark-fe801ff3-f023-4554-8237-8914d37c1485/executor-41cadbd8-1863-4fc9-986d-607fa1f6d797/spark-7990e75a-1c8e-48a6-b59f-b174342e7837/[StateStoreId(opId=0,partId=734,name=default), queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]-506ec0a7-fd75-4a75-9b09-717a64ee3478/checkpoint-42a1f8e8-a467-4c6a-9581-401d352a46bf/013865.log - 1368 bytes
> ....{code}
>
> Here the INFO logs print every checkpoint file path. These verbose INFO logs consume disk space in streaming applications, because most users keep INFO as the default log level.
> It is therefore better to log such details at DEBUG level.
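
The suggestion above (keep a short summary at INFO, move the per-file listing to DEBUG) can be sketched as follows. This is only an illustration of the logging pattern, not the actual Spark change: the class `CheckpointLogSketch`, its methods, and the shortened file paths are hypothetical, and it uses `java.util.logging` (where `FINE` stands in for DEBUG) so that it is self-contained.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: not Spark's RocksDBFileManager, only the logging pattern.
class CheckpointLogSketch {
    private static final Logger LOG =
        Logger.getLogger(CheckpointLogSketch.class.getName());

    // One-line summary: cheap enough to emit at INFO for every checkpoint.
    static String summary(long version, int fileCount) {
        return "Saving checkpoint files for version " + version
            + " - " + fileCount + " files";
    }

    // Per-file listing: one line per file, potentially thousands of lines.
    static String fileListing(Map<String, Long> files) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : files.entrySet()) {
            sb.append('\t').append(e.getKey())
              .append(" - ").append(e.getValue()).append(" bytes\n");
        }
        return sb.toString();
    }

    static void logCheckpointFiles(long version, Map<String, Long> files) {
        LOG.info(summary(version, files.size()));
        // FINE plays the role of DEBUG here. The guard avoids building the
        // large listing string when the detailed level is disabled.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(fileListing(files));
        }
    }

    public static void main(String[] args) {
        // Shortened, made-up paths; sizes taken from the log excerpt above.
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("checkpoint-example/005040.log", 3135L);
        files.put("checkpoint-example/013865.log", 1368L);
        logCheckpointFiles(285264L, files);
    }
}
```

With the default `java.util.logging` configuration only the summary line is emitted; the file listing appears only once the logger and its handler are set to `FINE` or lower.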