Hi,

We are doing some performance testing on a 12-node cluster with 8 task slots per TM. Every 15 minutes or so, the job runs into the following exception:

java.lang.IllegalArgumentException: Illegal value provided for SubCode.
    at org.rocksdb.Status$SubCode.getSubCode(Status.java:109)
    at org.rocksdb.Status.<init>(Status.java:30)
    at org.rocksdb.RocksDB.put(Native Method)
    at org.rocksdb.RocksDB.put(RocksDB.java:511)
    at org.apache.flink.contrib.streaming.state.AbstractRocksDBAppendingState.updateInternal(AbstractRocksDBAppendingState.java:80)
    at org.apache.flink.contrib.streaming.state.RocksDBReducingState.add(RocksDBReducingState.java:99)
    at org.apache.flink.streaming.runtime.operators.windowing.WindowOperator.processElement(WindowOperator.java:358)
    at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:745)

I saw an outstanding issue with a similar exception in [1]. The ticket description suggests that it was caused by running out of disk space, but in our case we have plenty of disk space left on all TMs.

Has anyone run into this before? If so, is there a fix or workaround?

Thanks,

[1] https://issues.apache.org/jira/browse/FLINK-9233

--
Ning