Hi Chris,
FileInputFormat automatically takes care of decompression for
files with gzip, xz, bz2, and deflate extensions.
--
Thanks,
Amit
Source:
https://github.com/apache/flink/blob/7b040b915504e59243c642b1f4a84c956d96d134/flink-core/src/main/java/org/apache/flink/api/common/io/FileInp
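To illustrate, the extension-based selection works along these lines. This is a simplified sketch, not Flink's actual code: only gzip and deflate are shown (xz and bz2 need extra libraries such as commons-compress), and the class and method names are made up for the example.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

// Sketch of extension-based decompression, similar in spirit to what
// FileInputFormat does internally when it opens an input split.
public class DecompressByExtension {
    public static InputStream open(String path) throws IOException {
        InputStream raw = new FileInputStream(path);
        if (path.endsWith(".gz")) {
            return new GZIPInputStream(raw);      // gzip-compressed file
        } else if (path.endsWith(".deflate")) {
            return new InflaterInputStream(raw);  // raw deflate stream
        }
        return raw;                               // unknown extension: pass through
    }
}
```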
Hi Harshith,
Did you enable delete permission on S3 for the running machines? Are you
using IAM roles or an access key ID and secret access key combo?
--
Thanks,
Amit
On Mon, Oct 15, 2018 at 3:15 PM Kumar Bolar, Harshith
wrote:
> Hi all,
>
>
>
> We store Flink checkpoints in Amazon S3. Flink periodic
Hi,
2) You may also want to look into the ParameterTool [1] class to parse and
read a passed properties file [2].
--
Thanks,
Amit
[1]
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/java/utils/ParameterTool.html
[2]
https://ci.apache.org/projects/flink/flink-docs-re
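For a sense of what this gives you, here is a minimal stand-in for what ParameterTool.fromPropertiesFile does. SimpleParams is a made-up name for illustration only; the real ParameterTool also merges command-line arguments and offers more typed getters.

```java
import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;

// Minimal stand-in for ParameterTool.fromPropertiesFile: load a properties
// file and expose typed getters with defaults.
public class SimpleParams {
    private final Properties props = new Properties();

    public SimpleParams(String path) throws IOException {
        try (FileReader reader = new FileReader(path)) {
            props.load(reader);
        }
    }

    public String get(String key, String defaultValue) {
        return props.getProperty(key, defaultValue);
    }

    public int getInt(String key, int defaultValue) {
        String v = props.getProperty(key);
        return v == null ? defaultValue : Integer.parseInt(v.trim());
    }
}
```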
Hi Marke,
The stack trace suggests it is more of a Redis connection issue rather than
something with Flink. Could you share the JedisPool configuration of the
Redis sink? Are you writing into Redis continuously or with some bulk logic?
It looks like the Redis connections are timing out here.
--
Thanks,
Amit
On Mon
Hi Nicos,
DatabaseClient is an example class used to illustrate the async I/O concept.
There is no interface/class for this client in the Flink codebase. You can
use any MariaDB client implementation that supports concurrent requests to
the DB.
--
Cheers,
Amit
On Wed, Oct 3, 2018 at 8:14 PM Nicos Maris wrote:
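For example, any client exposing a future-based API along these lines would fit. AsyncDatabaseClient below is a made-up in-memory stand-in, not a real MariaDB client; concurrency here comes from a plain thread pool.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the kind of "DatabaseClient" the async I/O docs assume: a
// client that can issue concurrent, non-blocking lookups. The backing
// "database" is an in-memory map for illustration.
public class AsyncDatabaseClient {
    private final Map<String, String> table;  // stand-in for the real DB
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public AsyncDatabaseClient(Map<String, String> table) {
        this.table = table;
    }

    // Returns immediately; the result arrives asynchronously.
    public CompletableFuture<String> query(String key) {
        return CompletableFuture.supplyAsync(
                () -> table.getOrDefault(key, "unknown"), pool);
    }

    public void close() {
        pool.shutdown();
    }
}
```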
Hi Julio,
What's the Flink version for this setup?
--
Thanks,
Amit
On Wed, Oct 3, 2018 at 4:22 PM Andrey Zagrebin
wrote:
> Hi Julio,
>
> Looks like some problem with dependencies.
> Have you followed the recommended s3 configuration guide [1]?
> Is it correct that your job already created chec
Hi Gravit,
I think Till is interested in the classpath details present at the
start of the JM and TM logs, e.g. the following log line shows the classpath
used by the TM in our case.
2018-06-17 19:01:30,656 INFO org.apache.flink.yarn.YarnTaskExecutorRunner
-
Hi Sandybayev,
In its current state, Flink does not provide a solution for the
mentioned use case. However, there is an open FLIP [1] [2] which was
created to address exactly this.
I can see that in your current approach you are not able to update the
rule set data. I think you can update the rule set data b
Hi Hao,
Have a look at
https://issues.apache.org/jira/browse/HADOOP-13811?focusedCommentId=15703276&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15703276
What version of Hadoop are you using? Could you provide the classpath used
by the Flink Job Manager? It is present in
Thanks Till. The `taskmanager.network.request-backoff.max` option helped in
my case. We tried this on 1.5.0 and the jobs are running fine.
--
Thanks
Amit
On Thu 24 May, 2018, 4:58 PM Amit Jain, wrote:
> Thanks! Till. I'll give a try on your suggestions and update the thread.
>
> On Wed
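For reference, the flink-conf.yaml entry looked roughly like this (the value below is illustrative, not the one we used):

```yaml
# flink-conf.yaml: back off longer before giving up on partition requests
taskmanager.network.request-backoff.max: 10000   # milliseconds (example value)
```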
from the BlobServer, Flink will first try to download them from the
> `high-availability.storageDir`
>
> Let me know if this solves your problem.
>
> Cheers,
> Till
>
> On Tue, May 22, 2018 at 1:29 PM, Amit Jain wrote:
>>
>> Hi Nico,
>>
>> Please find the
Hi,
Could you share the logs of the job and the impacted task manager? How much
memory have you allocated to the Job Manager?
--
Thanks,
Amit
On Mon, May 21, 2018 at 8:46 PM, sohimankotia wrote:
> Hi,
>
> I am running flink batch job .
>
> My job is running fine if i use 4 task manger and 8 slots = 32 parall
Hi Alex,
StreamExecutionEnvironment#readFile is a helper function to create a
file-reading streaming source. It uses
ContinuousFileReaderOperator and ContinuousFileMonitoringFunction
internally.
As both the file reader operator and the monitoring function use
checkpointing, so does readFile [1]; you c
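Conceptually, the monitoring side behaves like the simplified sketch below. DirectoryMonitor is a made-up class for illustration; the real ContinuousFileMonitoringFunction also splits files into input splits and checkpoints its read position.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of what ContinuousFileMonitoringFunction does on each
// scan: list the directory, report only files modified since the last scan,
// then advance the modification-time watermark.
public class DirectoryMonitor {
    private long lastSeenModTime = Long.MIN_VALUE;

    public List<File> scan(File dir) {
        List<File> newFiles = new ArrayList<>();
        long maxModTime = lastSeenModTime;
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && f.lastModified() > lastSeenModTime) {
                    newFiles.add(f);
                    maxModTime = Math.max(maxModTime, f.lastModified());
                }
            }
        }
        lastSeenModTime = maxModTime;  // older files are skipped on the next scan
        return newFiles;
    }
}
```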
0.n4.nabble.com/sink-with-BucketingSink-to-S3-files-override-td18433.html
> [2] https://issues.apache.org/jira/browse/FLINK-6306
>
> On Thu, May 17, 2018 at 11:57 AM, Amit Jain wrote:
>>
>> Hi,
>>
>> We are using Flink to process click stream data from Kafka and pus
nce semantics"[2]
>
> Thanks,
> Rong
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/connectors/filesystem_sink.html
> [2]
> https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/connectors/fs/bucketing/BucketingSink.h
Hi,
We are using Flink to process clickstream data from Kafka and pushing
it into 128 MB files in S3.
What are the message processing guarantees with the S3 sink? In my
understanding, the S3A client buffers the data in memory/on disk. In a
failure scenario on a particular node, the TM would not trigger Writer#clo
Thanks Chesnay for the quick reply. I raised the ticket
https://issues.apache.org/jira/browse/FLINK-9381.
On Wed, May 16, 2018 at 5:33 PM, Chesnay Schepler wrote:
> Please open a JIRA.
>
>
> On 16.05.2018 13:58, Amit Jain wrote:
>>
>> Hi,
>>
>> We are running
Hi,
We are running Flink 1.5.0 RC3 with YARN as the cluster manager and found
the Job Manager is getting killed due to an out-of-disk error.
Upon further analysis, we found that the blob server data for a job is not
getting cleaned up. For now, we have written a directory cleanup script
based on the directory creation time of
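The cleanup script is roughly the following sketch. The path and the 7-day cutoff are illustrative, not our production values, and mtime is used as a proxy for creation time.

```shell
#!/bin/sh
# Delete per-job blob directories whose modification time (a proxy for
# creation time) is older than 7 days.
BLOB_DIR="${BLOB_DIR:-/tmp/flink-blob-store}"
if [ -d "$BLOB_DIR" ]; then
    find "$BLOB_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
fi
```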
Hi Flavio,
Which version of Flink are you using?
--
Thanks,
Amit
On Fri, May 4, 2018 at 6:14 PM, Flavio Pompermaier wrote:
> Hi all,
> I've a Flink batch job that reads a parquet dataset and then applies 2
> flatMap to it (see pseudocode below).
> The problem is that this dataset is quite big a
(maybe we should not print it so prominently to not confuse users). There
> must be something else happening on the JobManager. Can you share the JM
> logs as well?
>
> Thanks a lot,
> Stephan
>
>
> On Wed, May 2, 2018 at 12:21 PM, Amit Jain wrote:
>>
>> Thanks! Fabian
&
ter your commit.
>
> Do you have a chance to build the current release-1.5 branch and check if
> the fix also resolves your problem?
>
> Otherwise it would be great if you could open a blocker issue for the 1.5
> release to ensure that this is fixed.
>
> Thanks,
> Fabia
Cluster is running on commit 2af481a
On Sun, Apr 29, 2018 at 9:59 PM, Amit Jain wrote:
> Hi,
>
> We are running numbers of batch jobs in Flink 1.5 cluster and few of those
> are getting stuck at random. These jobs having the following failure after
> which operator status changes
Hi,
We are running a number of batch jobs on a Flink 1.5 cluster and a few of
them are getting stuck at random. These jobs hit the following failure,
after which the operator status changes to CANCELED and stays there.
Please find complete TM's log at
https://gist.github.com/imamitjain/066d0e0ee2
Hi Gary,
This setting has resolved the issue. Does it increase the timeout for all
RPCs or only for specific components?
We had the following settings in Flink 1.3.2 and they did the job for us.
akka.watch.heartbeat.pause: 600 s
akka.client.timeout: 5 min
akka.ask.timeout: 120 s
--
Thanks,
Amit
Hi Gary,
We found the underlying issue via the following problem.
A few of our jobs are stuck with the logs in [1]; these jobs are only able
to allocate a JM and can't get any TM, even though there are ample
resources on our cluster.
We are running an ETL merge job here. In this job, we first find new deltas
a
Hi,
The keyBy operation partitions the data on the given key and makes sure the
same slot will get all future data belonging to the same key. In the default
implementation, it can also map a subset of the keys in your DataStream to
the same slot.
Assuming you have a number of keys equal to the number of running slots,
then you may specif
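As a simplified model of that default mapping: each key is hashed onto one of the parallel subtasks, so distinct keys can land on the same slot. Flink actually hashes keys into key groups first, but the many-keys-to-one-slot effect is the same; KeyToSlot is a made-up class for illustration.

```java
// Simplified model of how keyBy assigns keys to parallel slots.
public class KeyToSlot {
    public static int slotFor(String key, int parallelism) {
        // Math.floorMod keeps the result non-negative even for
        // negative hash codes.
        return Math.floorMod(key.hashCode(), parallelism);
    }
}
```

The same key always maps to the same slot, which is what guarantees per-key ordering downstream.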
+user@flink.apache.org
On Wed, Apr 4, 2018 at 11:33 AM, Amit Jain wrote:
> Hi,
>
> We are hitting a TaskManager deadlock on a NetworkBufferPool bug in Flink
> 1.3.2. We have a set of ETL merge jobs for a number of tables and are
> stuck with the above issue randomly every day.
>
> I