Hi all,

I'm using Flink 1.12 and a custom partitioner/partitioning key (batch mode, 
with a DataSet) to do a better job of distributing data to tasks. The classes 
look like:

public class MyPartitioner implements Partitioner<MyGroupingKey> 
{
    ...
}

public class MyGroupingKey implements Comparable<MyGroupingKey> 
{
    ...
}

This worked fine, but I noticed a warning logged by Flink about MyGroupingKey 
not having an empty constructor, and thus not being treated as a POJO.

I added that empty constructor, and then I got an error because 
partitionCustom() only works on a single field key.

So I changed MyGroupingKey to have a single field (a string), with transient 
cached values for the pieces of the key that I need while partitioning. Now I 
get an odd error:

java.lang.RuntimeException: Error while calling custom partitioner

Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to 
MyGroupingKey
        at MyPartitioner.partition(AdsPinotFilePartitioner.java:11)
        at 
org.apache.flink.runtime.operators.shipping.OutputEmitter.customPartition(OutputEmitter.java:235)
        ... 19 more

So I've got two questions…

        • Should I just get rid of the empty constructor, and have Flink treat 
it as a non-POJO? This seemed to be working fine.
        • Is it a bug in Flink that the extracted field from the key is being 
used as the expected type for partitioning?

Thanks!

— Ken

--------------------------
Ken Krugler
http://www.scaleunlimited.com
Custom big data solutions
Flink, Pinot, Solr, Elasticsearch



Reply via email to