Hello,

We are using the GenericWriteAheadSink to buffer up values to then send to a 
SQL Server database with a fast bulk copy upload. However, when I watch my 
process running it seems to be a huge amount of time iterating the Iterable<T> 
provided to the sendValues() method. It takes such a long time I’ve had to 
increase the checkpoint timeout because it causes the whole workflow to suspend.

I am using Flink 1.14.0 and have attached a simple, self-contained example. If 
I was to guess then there is a very large deserialization overhead from the 
checkpointed data even though I’m currently using a HashMapStateBackend. I have 
profiled the application and it definitely seems to spend most of its time 
there. The object involved is just a plain POJO.

A second “issue” is that I am forced to clone the objects provided by the 
iterator – when I dug into the code I could see a 
ReusingMutableToRegularIteratorWrapper class being using and the objects passed 
were being reused between 2 objects. I don’t know the reasoning behind this 
(except to prevent extra garbage?) but it would be nice if I could specify a 
“non-reusing” one otherwise there is a deserialization AND a clone for every 
object in the list.

Any pointers or advice on a better way to send large amounts of data to a SQL 
Server sink would be appreciated.

James.

The information transmitted is intended only for the person or entity to which 
it is addressed and may contain confidential and/or privileged material. Any 
review, retransmission, dissemination or other use of, or taking of any action 
in reliance upon, this information by persons or entities other than the 
intended recipient is prohibited. If you received this in error, please contact 
the sender and delete the material from any computer.

This communication is for informational purposes only. It is not intended as an 
offer or solicitation for the purchase or sale of any financial instrument or 
as an official confirmation of any transaction. Any market prices, data and 
other information are not warranted as to completeness or accuracy and are 
subject to change without notice. Any comments or statements made herein do not 
necessarily reflect those of Systematica Investments UK LLP, its parents, 
subsidiaries or affiliates.

Systematica Investments UK LLP (“SIUK”), which is authorised and regulated by 
the Financial Conduct Authority of the United Kingdom (the “FCA”) is authorised 
and regulated by the Financial Conduct Authority and is registered with the 
U.S. Securities and Exchange Commission as an investment adviser under the 
Investment Advisers Act of 1940.

Systematica Investments UK LLP is registered in England and Wales with a 
partnership number OC424197. Registered Office: Equitable House, 47 King 
William Street, London EC4R 9AF.

Recipients of this communication should note that electronic communication, 
whether by email, website, SWIFT or otherwise, is an unsafe method of 
communication. Emails and SWIFT messages may be lost, delivered to the wrong 
address, intercepted or affected by delays, interference by third parties or 
viruses and their confidentiality, security and integrity cannot be guaranteed. 
None of SIGPL or any of its affiliates bear any liability or responsibility 
therefor.

Please see the important information at 
www.systematica.com/disclaimer.<http://www.systematica.com/disclaimer>

Please see the important information, including regarding the processing of 
personal data by Systematica, at www.systematica.com/PrivacyNotice.

www.systematica.com<http://www.systematica.com/>

Attachment: SlowBulkCopySinkMain.java
Description: SlowBulkCopySinkMain.java

Reply via email to