Re: Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Hangxiang Yu
Hi, Vishal. IIUC, 1. FIFO compaction drops the old data by the configured size in L0, so the old data may be dropped but we could not know. That's why "it's basically a TTL compaction style and It is suited for keeping event log data with very low overhead (query log for example)". If it's the user

Re: [CEP] State compatibility when a pattern is modified

2022-07-04 Thread Dian Fu
Hi Nicolas, The state isn't compatible, besides, as the partial matches will also not be dropped and so the behavior is undefined. The original events will be dropped after being evaluated and so when the pattern changes, there is no way to evaluate them against the new pattern. Regards, Dian On

Re: Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Zhanghao Chen
Hi Vishal, FIFO compaction with RocksDB can result in data loss as it discards the oldest SST file by a size-based trigger or based on TTL (an internal RocksDB option, irrelevant to the TTL setting in Flink). So use with cause. I'm not sure why you observed SST file compactions, did you manuall

Re: How to mock new DataSource/Sink

2022-07-04 Thread Alexander Fedulov
Hi David, I started working on FLIP-238 exactly with the concerns you've mentioned in mind. It is currently in development, feel free to join the discussion [1]. If you need something ASAP and are not interested in rate-limiting functionality, you could drop in this [2] class into your tests suite

Re: Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Alexander Fedulov
Hi Vishal, I am not sure I get what you mean by the question #2: >2. SST files get created each time a checkpoint is triggered. At this point, does the data for a given key get merged in case the initial data was read from an SST file while the update must have happened in memory? Could you maybe

Re: ContinuousFileMonitoringFunction retrieved invalid state.

2022-07-04 Thread Vishal Surana
Wow! This is bad! I am using reactive mode and this is indeed the issue. This should have been urgently patched as jobs with upgraded Flink version are in very precarious position. With all the other upgrades (rocksdb, etc.) going into 1.15.0 there's no easy rollback. On Fri, Jul 1, 2022 at 8:14 A

Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Vishal Surana
In my load tests, I've found FIFO compaction to offer the best performance as my job needs state only for so long. However, this particular statement in RocksDB documentation concerns me: "Since we never rewrite the key-value pair, we also don't ever apply the compaction filter on the keys." This

[CEP] State compatibility when a pattern is modified

2022-07-04 Thread Nicolas Richard
Hello! What happens to partial matches if I deploy a new version of a CEP application with a modified pattern. * Application v1 looks for pattern a b c * Application v2 looks for pattern a b+ d c Is state compatible? Are partial matches dropped when a new version of an application is

RE: Unaligned checkpoint waiting in 'start delay' with AsyncDataStream

2022-07-04 Thread Nathan Sharp
Thank you for trying it out! Hopefully, there is just some setting that needs to be changed. I have an Ubuntu VM where I created a single node Docker swarm. Then I used the following command to run Flink 1.15.0 using the docker-compose.yml file in the repository: docker stack up -c docker-com

Re: Alternate Forms of Deserialization in Flink SQL CLI

2022-07-04 Thread Martijn Visser
Hi Eric, It would basically mean implementing Protobuf as is now done for AVRO or JSON. There is a pull request on adding Protobuf support currently being reviewed, hopefully that will make it in Flink 1.16 [1]. You could consider trying that out of course. Best regards, Martijn [1] https://git

Re: Does Flink 1.14 support comsume Kafka 0.9?

2022-07-04 Thread Martijn Visser
Hi, You can't use the released Flink 1.14 to connect to a Kafka 0.9 cluster. If you want to do it yourself, you would have to change the Flink source code to incorporate the downgrade and all the API changes from the currently used Kafka Client version (v2.4.1) [1] to a version that is still compa

Re: Unaligned checkpoint waiting in 'start delay' with AsyncDataStream

2022-07-04 Thread Chesnay Schepler
I ran your code in the IDE and it worked just fine; checkpoints are being completed and results are printed to the console. Can you expand on how you run the job? On 02/07/2022 00:26, Nathan Sharp wrote: I am attempting to use unaligned checkpointing with AsyncDataStream, but the checkpoints

Re: How to mock new DataSource/Sink

2022-07-04 Thread Chesnay Schepler
It is indeed not easy to mock sources/sink with the new interfaces. There is an effort to make this easier for sources in the future (FLIP-238 ). For the time being I'd stick with the

How to mock new DataSource/Sink

2022-07-04 Thread David Jost
Hi, we are currently looking at replacing our sinks and sources with the respective counterparts using the 'new' data source/sink API (mainly Kafka). What holds us back is that we are not sure how to test the pipeline with mocked sources/sinks. Up till now, we somewhat followed the 'Testing doc