Yes, we will use the latest sink interface. Best, Mats
> On 6. Jan 2021, at 11:05, Aljoscha Krettek <aljos...@apache.org> wrote: > > It's great to see interest in this. Where you planning to use the new Sink > interface that we recently introduced? [1] > > Best, > Aljoscha > > [1] https://s.apache.org/FLIP-143 > > On 2021/01/05 12:21, Poerschke, Mats wrote: >> Hi all, >> >> we want to contribute a sink connector for Apache Pinot. The following >> briefly describes the planned control flow. Please feel free to comment on >> any of its aspects. >> >> Background >> Apache Pinot is a large-scale real-time data ingestion engine working on >> data segments internally. The controller exposes an external API which >> allows posting new segments via REST call. A thereby posted segment must >> contain an id (called segment name). >> >> Control Flow >> The Flink sink will collect data tuples locally. When creating a checkpoint, >> all those tuples are grouped into one segment which is then pushed to the >> Pinot controller. We will assign each pushed segment a unique incrementing >> identifier. >> After receiving a success response from the Pinot controller, the latest >> segment name is persisted within the Flink checkpoint. >> In case we have to recover from a failure, the latest successfully pushed >> segment name can be reconstructed from the Flink checkpoint. At this point >> the system might be in an inconsistent state. The Pinot controller might >> have already stored a newer segment (which’s name was, due to the failure, >> not persisted in the flink checkpoint). >> This inconsistency is resolved with the next successful checkpoint creation. >> The there pushed segment will get the same segment name assigned as the >> inconsistent segment. Thus, Pinot replaces the old with the new segment >> which prevents introducing duplicate entries. >> >> >> Best regards >> Mats Pörschke >>