Greetings,

I am working on a project that needs to process around 100k events per second 
and I'm trying to improve performance.

Most of the classes being used are POJOs, but a couple of their fields use a 
`java.util` collection type such as `ArrayList`, `HashSet` or `SortedSet`. This 
forces Flink to fall back to Kryo and log these warnings:

```
class java.util.ArrayList does not contain a setter for field size
Class class java.util.ArrayList cannot be used as a POJO type because not all 
fields are valid POJO fields, and must be processed as GenericType. Please read 
the Flink documentation on "Data Types & Serialization" for details of the 
effect on performance and schema evolution.
```

```
No fields were detected for class java.util.HashSet so it cannot be used as a 
POJO type and must be processed as GenericType. Please read the Flink 
documentation on "Data Types & Serialization" for details of the effect on 
performance and schema evolution.
```
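
For context, here is a simplified sketch of the kind of class that triggers 
these warnings. The class and field names (`SensorEvent`, `tags`, `sourceIds`) 
are made up for illustration and are not my real code. As far as I understand 
it, the class otherwise follows Flink's POJO rules (public class, public 
no-argument constructor, getters and setters for every field), so only the 
collection-typed fields should be the problem.

```java
import java.util.ArrayList;
import java.util.HashSet;

// Hypothetical event class, for illustration only.
public class SensorEvent {
    // Simple fields like these are handled without any warnings.
    private long timestamp;
    private String name;

    // Fields declared with java.util collection types; these are the
    // types named in the warnings above.
    private ArrayList<String> tags = new ArrayList<>();
    private HashSet<Long> sourceIds = new HashSet<>();

    // Public no-argument constructor, as the POJO rules require.
    public SensorEvent() {}

    public long getTimestamp() { return timestamp; }
    public void setTimestamp(long timestamp) { this.timestamp = timestamp; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public ArrayList<String> getTags() { return tags; }
    public void setTags(ArrayList<String> tags) { this.tags = tags; }

    public HashSet<Long> getSourceIds() { return sourceIds; }
    public void setSourceIds(HashSet<Long> sourceIds) { this.sourceIds = sourceIds; }
}
```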

My question is: what do I need to do to get Flink to recognize my classes as 
POJOs and use the POJO serializer for better performance?

I read through the documentation and Stack Overflow, and the conclusion is 
that I need to write a `TypeInfoFactory` and reference it in a `@TypeInfo` 
annotation on my POJO.

This seems incredibly tedious and I keep thinking "there must be a better 
way", but I would be fine with this solution if I could figure out how to do 
it for the Set types I'm using.

Any help would be appreciated.
