When we submit a job that uses Hive UDFs, the job depends on the UDFs'
jars and configuration files.
We have already stored the UDFs' jars and configuration files in the Hive
metastore, so we expect that Flink could obtain those files' HDFS paths
through the hive-connector and then fetch the files from HDFS by those paths.
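For the piece that does exist today, a minimal sketch (Scala), assuming
Flink's HiveCatalog from flink-connector-hive: registering the metastore as
a catalog makes metadata defined in Hive (tables, functions) resolvable by
name. The catalog name, database, and conf dir are hypothetical, and
shipping the UDF jars themselves to the cluster still happens separately.

import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}
import org.apache.flink.table.catalog.hive.HiveCatalog

val tableEnv = TableEnvironment.create(EnvironmentSettings.newInstance().build())

// Point Flink at the Hive metastore; "myhive", "default", and the conf
// directory are placeholder values.
val hive = new HiveCatalog("myhive", "default", "/opt/hive/conf")
tableEnv.registerCatalog("myhive", hive)
tableEnv.useCatalog("myhive")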
Hi!
I want to join two tables and write the results to Avro, where the left and
right rows are nested in the Avro output. Is it possible to do this with
the SQL interface?
Thanks!
- Dan
CREATE TABLE `flat_avro` (
`left` ROW,
`right` ROW
) WITH (
'connector' = 'filesystem',
'path' =
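Not from the thread, but a minimal sketch (Scala) of one way this could
look using the SQL ROW value constructor; the tables a and b, their
columns, and the path are all hypothetical.

import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

val tableEnv = TableEnvironment.create(EnvironmentSettings.newInstance().build())

// Hypothetical sink: each side of the join becomes one nested ROW field.
tableEnv.executeSql(
  """CREATE TABLE flat_avro (
    |  `left`  ROW<id BIGINT, payload STRING>,
    |  `right` ROW<id BIGINT, payload STRING>
    |) WITH (
    |  'connector' = 'filesystem',
    |  'path' = '/tmp/flat_avro',
    |  'format' = 'avro'
    |)""".stripMargin)

// ROW(...) packs each side's columns into a single nested field.
tableEnv.executeSql(
  """INSERT INTO flat_avro
    |SELECT ROW(a.id, a.payload) AS `left`, ROW(b.id, b.payload) AS `right`
    |FROM a JOIN b ON a.id = b.id""".stripMargin)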
Hi,
This is in continuation of an already raised request (I had replied to the
same thread but haven't received any response yet, hence posting a new request):
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/The-rpc-invocation-size-exceeds-the-maximum-akka-framesize-when-the-job-was-
If your job crashes, Flink will automatically restart it from the latest
checkpoint, without any manual intervention. JobManager HA is only needed
for automatic recovery after a failure of the JobManager.
You only need externalized checkpoints and "-s :checkpointPath" if you want
to use a checkpoint for a manual restart.
Hi,
I am trying to calculate several different metrics, using the state backend
to check whether events have been seen before. I am using the
ProcessWindowFunction, but nothing gets through; it is as if the
.process function is ignored. Is it not possible to store a custom case
class as ValueState? Or do
> how can we know the expected size for which it is failing?
If you did not configure akka.framesize yourself then it is set to the
documented default value. See the configuration documentation for the
release you are using.
> Does the operator state have any impact on the expected Akka frame
Great! Many thanks to @ZhuZhu for driving this, and thanks to everyone who
contributed to the release!
Best,
Yun
-- Original Mail --
Sender: Jingsong Li
Send Date: Thu Sep 17 13:31:41 2020
Recipients: user-zh
CC: dev, user, Apache Announce List
Subject: Re: [ANNOUNCE] Apac
Thanks ZhuZhu for driving the release!!!
Best,
Guowei
On Fri, Sep 18, 2020 at 5:10 PM Yun Gao wrote:
> Great! Many thanks to @ZhuZhu for driving this, and thanks to everyone who
> contributed to the release!
>
> Best,
> Yun
>
> -- Original Mail --
> Sender: Jingsong Li
>
Hi community,
I am trying to deploy the default WordCount streaming application on
minikube using the official documentation at [1]. I am using minikube
v1.13.0 on Ubuntu 18.04 and Kubernetes v1.19.0 on Docker 19.03.8. I
could successfully start 1 JobManager and 3 TaskManagers using the
yaml files f
Thanks for the quick response.
I might have wrongly phrased one of the questions.
> how can we know the expected size for which it is failing?
> If you did not configure akka.framesize yourself then it is set to the
> documented default value. See the configuration documentation for the
> release you are using.
If you use 1.10.0 or above, the framesize for which it failed is part of
the exception message; see FLINK-14618.
If you are using an older version, then I'm afraid there is no way to tell.
On 9/18/2020 12:11 PM, shravan wrote:
Thanks for the quick response.
I might have wrongly phrased one of the questions.
Hi Martin,
I am not sure what the exact problem is. Is it that the ProcessFunction
is not invoked, or is the problem with the values in your state?
As for the question about the case class and ValueState: the best way to
do it is to provide the TypeInformation explicitly. If you do not provide
the TypeInformation
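Not part of the original reply, but a minimal sketch (Scala) of what
providing the TypeInformation explicitly can look like; MyEvent and the
descriptor name are made up.

import org.apache.flink.api.common.state.ValueStateDescriptor
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.streaming.api.scala.createTypeInformation

case class MyEvent(id: String, count: Long)

// Derived at compile time by Flink's Scala type-information macro,
// instead of falling back to generic (Kryo) serialization at runtime.
val eventTypeInfo: TypeInformation[MyEvent] = createTypeInformation[MyEvent]

// Pass the explicit TypeInformation to the state descriptor.
val descriptor = new ValueStateDescriptor[MyEvent]("last-event", eventTypeInfo)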
Thanks again for the quick response.
In that case, could you tell me what the possible factors are that warrant a
framesize increase? The official documentation simply states
"If Flink fails because messages exceed this limit, then you should increase
it", which isn't very convincing.
On 14.09.20 02:20, Steven Wu wrote:
Right now, we use UnionState to store the `nextCheckpointId` in the Iceberg
sink use case, because we can't retrieve the checkpointId from
the FunctionInitializationContext during restore. But we can move
away from it if the restore context provides the checkpointId.
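A minimal sketch (Scala) of the UnionState pattern being described, with
hypothetical names; getUnionListState hands every subtask the union of all
stored values on restore, which is how the sink can recover the id without
help from the context.

import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
import org.apache.flink.runtime.state.{FunctionInitializationContext, FunctionSnapshotContext}
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction

class NextCheckpointIdTracker extends CheckpointedFunction {
  @transient private var idState: ListState[java.lang.Long] = _
  private var nextCheckpointId: Long = 0L

  override def initializeState(context: FunctionInitializationContext): Unit = {
    val descriptor =
      new ListStateDescriptor[java.lang.Long]("next-checkpoint-id", classOf[java.lang.Long])
    // Union state: on restore, every subtask receives *all* stored values.
    idState = context.getOperatorStateStore.getUnionListState(descriptor)
    if (context.isRestored) {
      val it = idState.get().iterator()
      while (it.hasNext) {
        nextCheckpointId = math.max(nextCheckpointId, it.next().longValue())
      }
    }
  }

  override def snapshotState(context: FunctionSnapshotContext): Unit = {
    // The id of the checkpoint being taken is available here, so remember
    // the next expected one for the restore path above.
    nextCheckpointId = context.getCheckpointId + 1
    idState.clear()
    idState.add(nextCheckpointId)
  }
}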
There are quite a few reasons why the framesize could be exceeded.
The most common one we see is the parallelism being so high that
tasks can't be deployed in the first place. When a task is deployed, the
RPC payload also contains information about all the downstream tasks this
task sends data to.
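For reference, a minimal sketch (Scala) of where the knob lives. On a real
cluster this would normally go into flink-conf.yaml (e.g.
akka.framesize: 20485760b); the programmatic form below only makes sense
for local/test environments.

import org.apache.flink.configuration.{AkkaOptions, Configuration}
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

val conf = new Configuration()
// A size string as in the docs; this is twice the 10485760b default.
conf.setString(AkkaOptions.FRAMESIZE, "20485760b")
val env = StreamExecutionEnvironment.createLocalEnvironment(1, conf)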
Hi Dawid,
Thanks for your reply, much appreciated.
I tried using your implementation for TypeInformation, but still nothing
gets through. There are no errors either, but it simply runs without
sending data to the sink. I have checked that there is data in the input
topic, and I have used the code
Another note: the case class in hand has about 40 fields; is there a
maximum limit on the number of fields?
best regards
On Fri, Sep 18, 2020 at 13:05, Martin Frank Hansen <
m...@berlingskemedia.dk> wrote:
> Hi Dawid,
>
> Thanks for your reply, much appreciated.
>
> I tried using your i
Thanks for bringing this up, Juha, and good catch.
We actually disable WAL for routine writes by default when using
RocksDB and have never encountered segmentation fault issues. However, from
the history in FLINK-8922, the segmentation fault issue occurred during
restore if WAL was disabled, so I guess the root cause
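For context (not from the thread), a minimal sketch (Scala, RocksJava) of
what disabling the WAL means at the RocksDB API level.

import org.rocksdb.{RocksDB, WriteOptions}

RocksDB.loadLibrary()
// Writes skip the write-ahead log entirely: faster, but unflushed data is
// lost on a process crash, which is acceptable when state is re-created
// from Flink checkpoints anyway.
val writeOptions = new WriteOptions().setDisableWAL(true)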
Thanks David. In the case of a manual restart, to get the checkpoint path
programmatically I'm using the following code to retrieve the JobId and
CheckpointID so I can pass them along while restarting with "-s", but it
seems I'm missing something, as I'm getting an empty TimestampedFileSplit
array.
GlobFilePathFilter file
Hello,
I upgraded from Flink 1.7.2 to 1.10.2. One of the jobs running on the task
managers is periodically crashing with the following error:
java.lang.OutOfMemoryError: Metaspace. The metaspace out-of-memory error
has occurred. This can mean two things: either the job requires a larger
size of JVM metaspace to load classes or there is a class loading leak.
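Unrelated to the root cause, but for reference a minimal sketch (Scala) of
raising the limit the error refers to; in flink-conf.yaml this is
taskmanager.memory.jvm-metaspace.size, and the 256m value below is
arbitrary.

import org.apache.flink.configuration.{Configuration, MemorySize, TaskManagerOptions}

val conf = new Configuration()
// Maps to taskmanager.memory.jvm-metaspace.size in flink-conf.yaml.
conf.set(TaskManagerOptions.JVM_METASPACE, MemorySize.parse("256m"))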
Thank you, Chesnay!
On Thu, Sep 17, 2020, 3:59 AM Chesnay Schepler wrote:
> By default metrics are only updated every 10 seconds; this can be
> controlled via
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#metrics-fetcher-update-interval
> .
>
> On 9/17/2020 12:
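A minimal sketch (Scala) of the option the link points to; in
flink-conf.yaml this is metrics.fetcher.update-interval, in milliseconds
(the 1000 below is arbitrary).

import org.apache.flink.configuration.{Configuration, MetricOptions}

val conf = new Configuration()
// Lower the 10s default so the web UI refreshes metrics more often.
conf.setLong(MetricOptions.METRIC_FETCHER_UPDATE_INTERVAL, 1000L)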
I have a few results that I want to produce:
- A join B
- A join B join C
- A join B join C join D
- A join B join C join D join E
When I use the DataSet API directly, I can execute all of these in the same
job to reduce redundancy. When I use the SQL interface, it looks like
separate jobs are created.
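One possibility (a sketch, not a confirmed answer): since Flink 1.11 the
Table API has StatementSet, which bundles several INSERTs into a single
job so the planner can potentially reuse the shared A JOIN B subplan.
Table and sink names below are hypothetical (Scala).

import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}

val tableEnv = TableEnvironment.create(EnvironmentSettings.newInstance().build())

val stmtSet = tableEnv.createStatementSet()
stmtSet.addInsertSql("INSERT INTO sink_ab SELECT * FROM a JOIN b ON a.id = b.id")
stmtSet.addInsertSql(
  "INSERT INTO sink_abc SELECT * FROM a JOIN b ON a.id = b.id JOIN c ON b.id = c.id")
// All statements are optimized together and submitted as one job.
stmtSet.execute()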