Vinod KC created SPARK-51685:
--------------------------------

             Summary: Excessive INFO logging from RocksDB operations causing overly large executor stderr files
                 Key: SPARK-51685
                 URL: https://issues.apache.org/jira/browse/SPARK-51685
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.1.0
            Reporter: Vinod KC

Long-running Structured Streaming applications that use the RocksDB state store fail after some time with a "No space left on device" error. The executor logs show an excessive log volume: around 1 GB of data for 1 minute of run time. Each INFO log entry prints more than 4000 lines (the list of files involved in the RocksDB operation).

For example:
{code:java}
25/03/19 10:38:32 INFO RocksDBFileManager [StateStoreId(opId=0,partId=734,name=default), queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]: Saving checkpoint files for version 285264 - 2912 files
/local_disk0/spark-fe801ff3-f023-4554-8237-8914d37c1485/executor-41cadbd8-1863-4fc9-986d-607fa1f6d797/spark-7990e75a-1c8e-48a6-b59f-b174342e7837/[StateStoreId(opId=0,partId=734,name=default), queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]-506ec0a7-fd75-4a75-9b09-717a64ee3478/checkpoint-42a1f8e8-a467-4c6a-9581-401d352a46bf/005040.log - 3135 bytes
/local_disk0/spark-fe801ff3-f023-4554-8237-8914d37c1485/executor-41cadbd8-1863-4fc9-986d-607fa1f6d797/spark-7990e75a-1c8e-48a6-b59f-b174342e7837/[StateStoreId(opId=0,partId=734,name=default), queryRunId=30d232ab-0d2a-454d-a217-5b2debfe10e9, queryId=4c8ca990-9fd4-4b69-8f90-ce933b3f4ef3]-506ec0a7-fd75-4a75-9b09-717a64ee3478/checkpoint-42a1f8e8-a467-4c6a-9581-401d352a46bf/013865.log - 1368 bytes
....
{code}

Here the INFO log prints every checkpoint file path. These verbose INFO logs consume the streaming application's disk space, because most users keep INFO as the default log level. It would be better to log such details at DEBUG level.
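
The proposal can be illustrated with a minimal sketch. It is not Spark's RocksDBFileManager code: the class `CheckpointLogSketch` and its methods are hypothetical, and it uses `java.util.logging` (where `FINE` plays roughly the role of DEBUG) only to stay self-contained, whereas Spark logs through its own logging layer. The idea is to keep the one-line summary at INFO and move the per-file listing to the debug level.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical illustration only; not taken from the Spark code base. */
public class CheckpointLogSketch {
    private static final Logger LOG =
        Logger.getLogger(CheckpointLogSketch.class.getName());

    /** One-line summary: short enough to keep at INFO. */
    public static String summary(long version, Map<String, Long> files) {
        return "Saving checkpoint files for version " + version
            + " - " + files.size() + " files";
    }

    /** Per-file listing: one line per file, so it grows with the file count. */
    public static String details(Map<String, Long> files) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : files.entrySet()) {
            sb.append(e.getKey()).append(" - ")
              .append(e.getValue()).append(" bytes\n");
        }
        return sb.toString();
    }

    public static void logCheckpoint(long version, Map<String, Long> files) {
        LOG.info(summary(version, files));
        // The Supplier overload builds the long listing only if FINE is enabled.
        LOG.log(Level.FINE, () -> details(files));
    }

    public static void main(String[] args) {
        // File names and sizes mirror the tail of the paths in the log excerpt.
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("005040.log", 3135L);
        files.put("013865.log", 1368L);
        logCheckpoint(285264L, files);
    }
}
```

With the default INFO level, only the summary line is emitted. The file listing appears only when the logger is explicitly set to the finer level, so a checkpoint with thousands of files no longer multiplies the size of the executor log.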