Hi Vijayendra,
OutputFileConfig provides a builder method to create immutable objects
with a given 'prefix' and 'suffix'. The parameter that you pass to
'*withPartPrefix*' is only evaluated at the time of calling
'*withPartPrefix*'. So if you want to achieve a dynamic
'prefi
deflate - for deflateCodec
snappy - for snappyCodec
bzip2 - for bzip2Codec
xz - for xzCodec
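The builder usage being described is cut off above; here is a minimal, hedged sketch of it (the codec name and values are illustrative, not from the original mail):

import org.apache.flink.streaming.api.functions.sink.filesystem.OutputFileConfig

val codecName = "snappy" // hypothetical; any of the deflate / snappy / bzip2 / xz names listed above

// withPartPrefix/withPartSuffix evaluate their arguments once, when they are called,
// and the resulting OutputFileConfig is immutable.
val outputFileConfig = OutputFileConfig.builder()
  .withPartPrefix("part")
  .withPartSuffix("." + codecName)
  .build()

// The config can then be passed to the StreamingFileSink builder, e.g. via withOutputFileConfig(...).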
Regards,
Ravi
On Thu, Jul 30, 2020 at 8:21 AM Ravi Bhushan Ratnakar <
ravibhushanratna...@gmail.com> wrote:
> If it is possible, please share the sample output file.
> Regards,
> Ravi
>
> On T
dec.SNAPPY)
stream:DataStream.writeUsingOutputFormat(aof)
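The snippet is truncated in this preview; a hedged reconstruction of the idea (writing Avro through a Snappy-compressed AvroOutputFormat), where MyRecord is a hypothetical generated Avro record class and 'stream' is a DataStream[MyRecord]:

import org.apache.flink.core.fs.Path
import org.apache.flink.formats.avro.AvroOutputFormat

val aof = new AvroOutputFormat[MyRecord](new Path(outputPath), classOf[MyRecord])
aof.setCodec(AvroOutputFormat.Codec.SNAPPY) // DEFLATE, BZIP2 and XZ are also available
stream.writeUsingOutputFormat(aof)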
Regards,
Ravi
On Wed, Jul 29, 2020 at 9:12 PM Ravi Bhushan Ratnakar <
ravibhushanratna...@gmail.com> wrote:
> Hi Vijayendra,
>
> Currently AvroWriters doesn't support compression. If you want to use
> compressi
Hi Vijayendra,
Currently AvroWriters doesn't support compression. If you want to use
compression, then you need a custom implementation of the Avro writer
where you can add compression. Please find a sample
customization of AvroWriters where you could use compression. You can use
th
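The sample itself is not included in this preview. A hedged sketch of one possible customization, using a BulkWriter.Factory that wraps Avro's DataFileWriter and sets a compression codec (the class and names below are illustrative, not the original sample):

import org.apache.avro.Schema
import org.apache.avro.file.{CodecFactory, DataFileWriter}
import org.apache.avro.generic.{GenericDatumWriter, GenericRecord}
import org.apache.flink.api.common.serialization.BulkWriter
import org.apache.flink.core.fs.FSDataOutputStream

// Serializable factory: it only carries the schema string and the codec name.
class CompressedAvroWriterFactory(schemaString: String, codecName: String)
    extends BulkWriter.Factory[GenericRecord] {

  override def create(out: FSDataOutputStream): BulkWriter[GenericRecord] = {
    val schema = new Schema.Parser().parse(schemaString)
    val dataFileWriter =
      new DataFileWriter[GenericRecord](new GenericDatumWriter[GenericRecord](schema))
    dataFileWriter.setCodec(CodecFactory.fromString(codecName)) // "deflate", "snappy", "bzip2", "xz"
    dataFileWriter.create(schema, out)

    new BulkWriter[GenericRecord] {
      override def addElement(element: GenericRecord): Unit = dataFileWriter.append(element)
      override def flush(): Unit = dataFileWriter.flush()
      // The sink closes the underlying stream itself, so only flush here.
      override def finish(): Unit = dataFileWriter.flush()
    }
  }
}

// Usage: StreamingFileSink.forBulkFormat(new Path(outputPath),
//   new CompressedAvroWriterFactory(schemaString, "snappy")).build()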
ue, Jul 28, 2020 at 1:20 PM Ravi Bhushan Ratnakar <
> ravibhushanratna...@gmail.com> wrote:
>
>> Hi Vijayendra,
>>
>> As far as rowFormat is concerned, it doesn't support compression.
>>
>>
>> Regards,
>> Ravi
>>
>> On Tue 28 Ju
Regards,
> Vijay
>
> On Tue, Jul 28, 2020 at 11:28 AM Ravi Bhushan Ratnakar <
> ravibhushanratna...@gmail.com> wrote:
>
>> Hi Vijayendra,
>>
>> You could achieve row-encoded output with compression like this as well:
>>
>> codecName = "org.apache.had
Hi Vijayendra,
You could achieve row-encoded output with compression like this as well:
codecName = "org.apache.hadoop.io.compress.GzipCodec"
val streamingFileSink:StreamingFileSink[String] =
StreamingFileSink.forBulkFormat(new
Path(outputPath),CompressWriters.forExtractor(new
DefaultExtractor[String]).withHadoopCo
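For readability, a hedged completion of the truncated snippet above, assuming the flink-compress module's CompressWriters/DefaultExtractor API and a Hadoop codec on the classpath:

import org.apache.flink.core.fs.Path
import org.apache.flink.formats.compress.CompressWriters
import org.apache.flink.formats.compress.extractor.DefaultExtractor
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink

val codecName = "org.apache.hadoop.io.compress.GzipCodec"
val streamingFileSink: StreamingFileSink[String] =
  StreamingFileSink
    .forBulkFormat(
      new Path(outputPath),
      CompressWriters.forExtractor(new DefaultExtractor[String]).withHadoopCompression(codecName))
    .build()

stream.addSink(streamingFileSink)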
er get a checkpointed
>> state. I guess this is okay since in production we won't have idle shards,
>> but it might be better to send through an empty record that doesn't get
>> emitted, but it does trigger a state update.
>>
>> -Steve
>>
>>
>
ream.write('\n');
> }
>
> public void finish() throws IOException {
> compressedStream.flush();
> compressedStream.finish();
> }
>
> public void flush() throws IOException {
> compressedStream.flush();
> }
> }
>
>
> On Mon, Oct 21, 2019
Hi,
It seems that you want to use "com.hadoop.compression.lzo.LzoCodec"
instead of "com.hadoop.compression.lzo.*LzopCodec*" in the line below.
compressedStream =
factory.getCodecByClassName("com.hadoop.compression.lzo.LzopCodec").createOutputStream(stream);
Regarding "lzop: unexpected end of
Hi,
As an alternative, you may use BucketingSink, which lets you
customize the suffix/prefix.
https://ci.apache.org/projects/flink/flink-docs-release-1.9/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.html
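A minimal sketch, assuming BucketingSink's part-file prefix/suffix setters (please confirm them against the Javadoc linked above for your Flink version); the path and 'stream' are hypothetical:

import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink

val sink = new BucketingSink[String]("s3a://my-bucket/output")
  .setPartPrefix("part") // custom prefix for part files
  .setPartSuffix(".gz")  // custom suffix, e.g. matching the compression codec
stream.addSink(sink)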
Regards,
Ravi
On Sat, Oct 19, 2019 at 3:54 AM a
Hi,
As per my understanding, the Encoder's encode method is called for each and
every message, hence it is not logical to create a compressor around the given
output stream; that would lead to an unpredictable, erroneous situation.
The Encoder's responsibility is to encode the given object, not to compress it. It
see
t exist in a savepoint? This seems like a big problem.
>
> -Steve
>
> On Wed, Oct 16, 2019 at 12:08 AM Ravi Bhushan Ratnakar <
> ravibhushanratna...@gmail.com> wrote:
>
>> Hi,
>>
>> I am also facing the same problem. I am using Flink 1.9.0 and consuming
>> from
Hi,
I was also experiencing similar behavior. I adopted the following
approach:
- used a distributed file system (in my case AWS EFS) and set the
attribute "web.upload.dir"; this way both job managers have the same
location.
- on the load balancer side (AWS ELB), I used "readiness p
Hi,
I am also facing the same problem. I am using Flink 1.9.0 and consuming
from a Kinesis source with a retention of 1 day. I am observing that when the
job is submitted with the "latest" initial stream position, the job starts well
and keeps on processing data from all the shards for a very long period of
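For context, a hedged sketch of the kind of consumer setup being described here; the stream name, region and 'env' (the StreamExecutionEnvironment) are hypothetical:

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer
import org.apache.flink.streaming.connectors.kinesis.config.{AWSConfigConstants, ConsumerConfigConstants}

val consumerConfig = new Properties()
consumerConfig.put(AWSConfigConstants.AWS_REGION, "eu-west-1")
// start from the latest record in each shard instead of TRIM_HORIZON
consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST")

val kinesisSource = env.addSource(
  new FlinkKinesisConsumer[String]("my-stream", new SimpleStringSchema(), consumerConfig))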
herwise I am sure it's always a problem, whatever kind of
> streaming engine you use. Tune your configuration to get the optimal rate
> so that the Flink checkpoint state is healthier.
>
> Regards
> Bhaskar
>
> On Tue, Sep 10, 2019 at 11:16 PM Ravi Bhushan Ratnakar <
> r
>> Ravi, have you looked at the I/O operations (IOPS) rate of the disk? You can
>> monitor the IOPS performance and tune it according to your workload.
>> This helped us in our project when we hit the wall after tuning pretty
>> much all the parameters.
>>
>
cs-release-1.9/ops/state/state_backends.html#the-rocksdbstatebackend
>
>
> Thanks,
> Rafi
>
> On Sat, Sep 7, 2019, 17:47 Ravi Bhushan Ratnakar <
> ravibhushanratna...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am writing a streaming application using Flink 1.9. This
Hi All,
I am writing a streaming application using Flink 1.9. This application
consumes data from a Kinesis stream whose payload is basically Avro.
The application uses a KeyedProcessFunction to execute business logic on the
basis of a correlation id, using event time characteristics with the below
configura
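The configuration itself is cut off above; as a generic, hedged skeleton of the pattern being described (keying by a correlation id and using event-time timers in a KeyedProcessFunction), with the record type, timeout and business logic purely illustrative:

import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector

case class Event(correlationId: String, payload: String, eventTime: Long) // hypothetical record type

class CorrelationFunction extends KeyedProcessFunction[String, Event, String] {
  private lazy val buffered: ValueState[String] =
    getRuntimeContext.getState(new ValueStateDescriptor[String]("buffered", classOf[String]))

  override def processElement(value: Event,
                              ctx: KeyedProcessFunction[String, Event, String]#Context,
                              out: Collector[String]): Unit = {
    buffered.update(value.payload)
    // fire an event-time timer some time after the event's timestamp
    ctx.timerService().registerEventTimeTimer(value.eventTime + 60000L)
  }

  override def onTimer(timestamp: Long,
                       ctx: KeyedProcessFunction[String, Event, String]#OnTimerContext,
                       out: Collector[String]): Unit = {
    out.collect(buffered.value())
    buffered.clear()
  }
}

// usage on a hypothetical stream: events.keyBy(_.correlationId).process(new CorrelationFunction)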
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kinesis.html#event-time-for-consumed-records
>
> Thanks,
> Rafi
>
>
> On Sat, Aug 3, 2019 at 8:23 PM Ravi Bhushan Ratnakar <
> ravibhushanratna...@gmail.com> wrote:
>
>> Hi All,
>>
Hi All,
I am designing a streaming pipeline using Flink 1.8.1, which consumes
messages from Kinesis and applies some business logic on a per-key basis using
KeyedProcessFunction and checkpointing (HeapStateBackend). It is consuming
around 7GB of messages per minute from multiple Kinesis streams. I am usi
>>hadoop-common)? If so, could you please share your dependency versioning?
>>2. Does this use a kafka source with high flink parallelism (~400)
>>for all kafka partitions and does it run continuously for several days?
>>3. Could you please share your checkpoint interval c
I have made small changes to BucketingSink and implemented a new
CustomBucketingSink to use in my project, which works fine with the s3 and s3a
protocols. This implementation doesn't require XML file configuration;
rather, it uses the configuration provided via the Flink configuration object
by calling
Hi Everybody,
Currently I am working on a project where I need to write a Flink batch
application which has to process around 400GB of compressed
sequence files hourly. After processing, it has to write the output as compressed Parquet
format to S3.
I have managed to write the application in Flink and a