Hi guys, I am getting invalid UTF-32 character exceptions in the Elasticsearch sink. It happens on very busy Flume servers, and as a result the whole delivery of messages to Elasticsearch grinds to a halt. I am not sure whether it is related to FLUME-2089 (https://issues.apache.org/jira/browse/FLUME-2089). Since all message collection stops because of this, I would rather drop these events than have no messages collected. I am thinking of using the regex filtering interceptor for this purpose.
Here is the error message:

26 Aug 2014 14:18:38,279 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to commit transaction. Transaction rolled back.
    at org.apache.flume.sink.elasticsearch.ElasticSearchSink.process(ElasticSearchSink.java:227)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.CharConversionException: Invalid UTF-32 character 0x27b0001(above 10ffff) at char #1, byte #7)
    at org.elasticsearch.common.jackson.core.io.UTF32Reader.reportInvalid(UTF32Reader.java:155)
    at org.elasticsearch.common.jackson.core.io.UTF32Reader.read(UTF32Reader.java:109)
    at org.elasticsearch.common.jackson.core.json.ReaderBasedJsonParser.loadMore(ReaderBasedJsonParser.java:131)
    at org.elasticsearch.common.jackson.core.json.ReaderBasedJsonParser._skipWSOrEnd(ReaderBasedJsonParser.java:1688)
    at org.elasticsearch.common.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:562)
    at org.elasticsearch.common.xcontent.json.JsonXContentParser.nextToken(JsonXContentParser.java:50)
    at org.apache.flume.sink.elasticsearch.ContentBuilderUtil.addComplexField(ContentBuilderUtil.java:62)
    at org.apache.flume.sink.elasticsearch.ContentBuilderUtil.appendField(ContentBuilderUtil.java:47)
    at org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.appendHeaders(ElasticSearchLogStashEventSerializer.java:131)
    at org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.getContentBuilder(ElasticSearchLogStashEventSerializer.java:80)
    at org.apache.flume.sink.elasticsearch.ElasticSearchLogStashEventSerializer.getContentBuilder(ElasticSearchLogStashEventSerializer.java:73)
    at org.apache.flume.sink.elasticsearch.client.ElasticSearchTransportClient.addEvent(ElasticSearchTransportClient.java:164)
    at org.apache.flume.sink.elasticsearch.ElasticSearchSink.process(ElasticSearchSink.java:189)
    ... 3 more
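
For reference, a rough sketch of what a regex filtering interceptor setup for dropping such events might look like. The agent name a1, source name r1, and interceptor name i1 are placeholders, and the regex is only an assumption (it drops events whose body contains a NUL byte); the actual pattern would depend on what the offending events turn out to contain.

```properties
# Hypothetical names: agent "a1", source "r1", interceptor "i1".
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_filter
# Assumed pattern: match any event body containing a NUL byte.
# The backslash is doubled because this is a Java properties file.
a1.sources.r1.interceptors.i1.regex = \\x00
# true = drop events that match the regex, keep everything else.
a1.sources.r1.interceptors.i1.excludeEvents = true
```

One caveat with this approach: the regex filtering interceptor matches against the event body only, while the stack trace above goes through appendHeaders, so if the bad bytes are in a header value this filter alone may not catch them.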