Flink S3 error

2019-10-30 Thread Harrison Xu
I'm seeing this exception with the S3 uploader - it claims a previous part file was not found. Full jobmanager logs attached. (Flink 1.8) java.io.FileNotFoundException: No such file or directory: s3a://qcache/tmp/kafka/meta/rq_features/dt=2019-10-30T15/partition_1/_part-4-1169_tmp_21400e5e-3921-
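
For context, a minimal sketch (not the poster's actual job) of the setup this thread describes: a StreamingFileSink writing row-encoded part files to an s3a:// path, which is what produces the in-progress "_part-...-..._tmp_<uuid>" objects named in the exception. The path, checkpoint interval, and record below are illustrative assumptions.

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class S3SinkSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Part files are only finalized when a checkpoint completes, so
        // checkpointing must be enabled for the sink to make progress.
        env.enableCheckpointing(60_000);

        StreamingFileSink<String> sink = StreamingFileSink
                .forRowFormat(new Path("s3a://qcache/tmp/kafka/meta/rq_features"),
                        new SimpleStringEncoder<String>("UTF-8"))
                .build();

        env.fromElements("example-record").addSink(sink);
        env.execute("s3a streaming file sink sketch");
    }
}

Because staged tmp objects live in S3 between checkpoints, anything that removes them out of band (a bucket lifecycle rule, a concurrent job, manual cleanup) can surface as exactly this kind of FileNotFoundException.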

StreamingFileSink to S3 failure to complete multipart upload

2019-11-06 Thread Harrison Xu
Hello, I'm seeing the following behavior in StreamingFileSink (1.9.1) uploading to S3. 2019-11-06 15:50:58,081 INFO com.quora.dataInfra.s3connector.flink.filesystem.Buckets - Subtask 1 checkpointing for checkpoint with id=5025 (max part counter=3406). 2019-11-06 15:50:58,448 INFO org.apac
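
The log lines show the sink's Buckets snapshotting state on a checkpoint; on S3 each in-progress part file is backed by a multipart upload that is persisted at checkpoint time and completed when the file is later closed. A hedged sketch of the checkpoint settings that drive that cycle; every value here is illustrative, not taken from the thread:

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        CheckpointConfig cfg = env.getCheckpointConfig();
        cfg.setCheckpointTimeout(10 * 60 * 1000); // headroom for slow S3 part uploads
        cfg.setMinPauseBetweenCheckpoints(30_000);
        cfg.setMaxConcurrentCheckpoints(1);
    }
}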

Re: StreamingFileSink to S3 failure to complete multipart upload

2019-11-06 Thread Harrison Xu
Harrison Xu wrote: > Hello, > I'm seeing the following behavior in StreamingFileSink (1.9.1) uploading > to S3. > > 2019-11-06 15:50:58,081 INFO > com.quora.dataInfra.s3connector.flink.filesystem.Buckets - Subtask > 1 checkpointing for checkpoint with id=5025 (max p

Unable to retrieve Kafka consumer group offsets

2019-11-07 Thread Harrison Xu
I am using Flink 1.9.0 and KafkaConsumer010 (Kafka 0.10.1.1). Attempting to retrieve the offset lag of Flink Kafka consumers results in the below error. I saw a separate thread about this in the mailing list in 2017 - is this not fixed? Are there workarounds? > $ /work/kafka_2.11-0.10.1.1/bin/kaf
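
A likely explanation, consistent with the 2017 thread mentioned above: Flink's Kafka consumer assigns partitions itself instead of joining a Kafka consumer group, so the group coordinator sees no active members and the group-describe tooling can fail. A common mitigation is to make sure Flink commits offsets back to Kafka on checkpoints so offset-reading tools see something; a sketch, with the broker, group id, and topic assumed:

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

public class OffsetCommitSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // offsets are committed on checkpoints

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");
        props.setProperty("group.id", "my-flink-job");

        FlinkKafkaConsumer010<String> consumer =
                new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props);
        // Commit checkpointed offsets back to Kafka for external monitoring;
        // Flink itself restores positions from checkpoints, not from these offsets.
        consumer.setCommitOffsetsOnCheckpoints(true);

        env.addSource(consumer).print();
        env.execute("kafka offset commit sketch");
    }
}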

Re: Unable to retrieve Kafka consumer group offsets

2019-11-08 Thread Harrison Xu
d are you referring to exactly? I've pulled in > Becket who might be able to tell you more about the Kafka connector. > > Cheers, > Till > > On Thu, Nov 7, 2019 at 11:11 PM Harrison Xu wrote: > >> I am using Flink 1.9.0 and KafkaConsumer010 (Kafka 0.10.1.1). Attempting >

Flink 1.9.1 KafkaConnector missing data (1M+ records)

2019-11-25 Thread Harrison Xu
Hello, We're seeing some strange behavior with Flink's KafkaConnector010 (Kafka 0.10.1.1) arbitrarily skipping data. Context: KafkaConnector010 is used as source, and StreamingFileSink/BulkPartWriter (S3) as sink with no intermediate operators. Recently, we noticed that millions of Kafka records
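
A hedged sketch of the topology as described (Kafka 0.10 source wired straight into a bulk-format StreamingFileSink, which writes through BulkPartWriter and rolls part files only on checkpoints). The topic, path, record type, and the Parquet writer (flink-parquet dependency) are assumptions; the small map step exists only to adapt types for this sketch:

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

public class KafkaToS3Sketch {
    /** Assumed record shape; the real schema is not given in the thread. */
    public static class Event {
        public String value;
        public Event() {}
        public Event(String value) { this.value = value; }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // bulk sinks roll part files per checkpoint

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");
        props.setProperty("group.id", "kafka-to-s3");

        FlinkKafkaConsumer010<String> source =
                new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props);

        StreamingFileSink<Event> sink = StreamingFileSink
                .forBulkFormat(new Path("s3a://bucket/output"),
                        ParquetAvroWriters.forReflectRecord(Event.class))
                .build();

        env.addSource(source)
           .map(Event::new).returns(Event.class)
           .addSink(sink);

        env.execute("kafka to s3 bulk sink sketch");
    }
}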

Re: Flink 1.9.1 KafkaConnector missing data (1M+ records)

2019-12-02 Thread Harrison Xu
ould help if you could share a > bit more information about your BucketAssigner. > How are these names assigned to the files and what does TT stand for? > Can it be that there are a lot of events for partition 4 that fill up > 2 part files for that duration? I am > asking because the
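
The reply is asking about the BucketAssigner behind bucket names like "dt=2019-10-30T15/partition_1". A hypothetical assigner of that shape; the record type, its fields, and the UTC hour truncation are all assumptions made for illustration:

import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

/** Assumed record shape; the actual type in the thread is unknown. */
class KafkaRecord {
    long timestampMillis;
    int kafkaPartition;
}

public class HourAndPartitionBucketAssigner
        implements BucketAssigner<KafkaRecord, String> {

    // Static so the non-serializable formatter is not dragged into the
    // (Serializable) assigner instance.
    private static final DateTimeFormatter HOUR_FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH").withZone(ZoneOffset.UTC);

    @Override
    public String getBucketId(KafkaRecord element, Context context) {
        // e.g. "dt=2019-10-30T15/partition_1"
        return "dt=" + HOUR_FMT.format(Instant.ofEpochMilli(element.timestampMillis))
                + "/partition_" + element.kafkaPartition;
    }

    @Override
    public SimpleVersionedSerializer<String> getSerializer() {
        return SimpleVersionedStringSerializer.INSTANCE;
    }
}

Since part files are named part-<subtask>-<counter>, two files in the same bucket that reuse a counter value would silently overwrite each other on S3, which is one hedged way records could appear to vanish.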