Hello,
I have a table T0 with the following schema -
root
|-- amount: BIGINT
|-- timestamp: TIMESTAMP(3)
The following two queries fail on this table when executed in streaming
mode using the Blink planner.
WITH A AS (
SELECT COUNT(*) AS ct, tumble_end(`timestamp`,
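For context, a complete tumbling-window aggregation of the shape the truncated query above begins with might look like the following. This is only a sketch: the window size and aliases are assumptions, not taken from the original query.

```sql
-- Hypothetical reconstruction: count rows per one-minute tumbling window over T0.
-- The INTERVAL and the aliases are illustrative assumptions.
SELECT
  COUNT(*) AS ct,
  TUMBLE_END(`timestamp`, INTERVAL '1' MINUTE) AS window_end
FROM T0
GROUP BY TUMBLE(`timestamp`, INTERVAL '1' MINUTE);
```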
Great to hear it works!
`setStartFromGroupOffsets` [1] will start reading partitions from the
consumer group's (the group.id setting in the consumer properties)
committed offsets in the Kafka brokers. If offsets cannot be found for a
partition, the 'auto.offset.reset' setting in the properties will be used.
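As an illustration, a minimal consumer-properties sketch covering the two settings involved (the broker address and group name are placeholders):

```properties
# Properties passed to the FlinkKafkaConsumer (all values are placeholders).
bootstrap.servers=localhost:9092
# Committed offsets are looked up for this consumer group.
group.id=my-consumer-group
# Fallback used when no committed offset exists for a partition.
auto.offset.reset=earliest
```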
Hello,
I'm running a Job on AWS EMR with the TableAPI that does a long series of
Joins, GroupBys, and Aggregates and I'd like to know how to best tune
parallelism.
In my case, I have 8 EMR core nodes set up, each with 4 vCores and 8 GiB of
memory. There's a job we have to run that has ~30 table opera
I don't think the first option has any benefit.
On 11/5/2020 1:19 AM, Alexey Trenikhun wrote:
Hello,
I have two Kafka topics ("A" and "B") which provide similar structure
wise data but with different load pattern, for example hundreds
records per second in first topic while 10 records per sec
It would be good if you could elaborate a bit more on your use-case.
Are you using batch or streaming? What kind of "message" are we talking
about? Why are you thinking of using a static variable, instead of just
treating this message as part of the data(set/stream)?
On 11/5/2020 12:55 PM, Si-
It is unlikely that this is a port issue, and I would currently suspect
that something in your SSL setup is not correct.
@Nico: do you have a suggestion on how to debug this?
On 11/5/2020 4:23 PM, Patrick Eifler wrote:
Hi,
I did set up a flink session cluster on K8s.
Now I added the ssl conf
I'd go with the network congestion theory for the time being; then the
only remedy is throttling the download of said list, or somehow reducing
the size of it significantly.
What the task thread is doing doesn't matter with regard to HA; it may
cause checkpoints to time out, but should have no
Hi there,
I have a stream where I reload a huge list from time to time. I know there are
various Side-Input patterns, but none of them seem to be perfect so I stuck
with an easy approach: I use a Guava Cache and if it expires and a new element
comes in, processing of the element is blocked up
Hi everybody,
I was trying to use the JobListener in my job on Flink 1.11.0, but from
onJobExecuted() I can't tell whether the job succeeded or not.
The Javadoc of JobListener.onJobExecuted() [1] says
"Callback on job execution finished, successfully or unsuccessfully"
but I can't fi
Yes,
Kafka Connect supports the topics.regex option for sink connectors. The
connector automatically discovers new topics that fit the regex pattern.
It's similar for source connectors, which discover tables in a SQL
database and save them to Kafka topics.
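For illustration, a minimal sink-connector config using topics.regex (the connector class and names here are hypothetical examples, not taken from this thread):

```json
{
  "name": "example-jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics.regex": "metrics-.*"
  }
}
```

Any newly created topic matching metrics-.* would be picked up automatically by such a connector.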
On Thu, 5 Nov 2020 at 04:16, Jark Wu wrote:
Also, just to be clear, our ES connector looks like this:
CREATE TABLE sink_es_groups (
id BIGINT,
-- .. a bunch of scalar fields
array_of_ids ARRAY,
PRIMARY KEY (id) NOT ENFORCED
) WITH (
'connector' = 'elasticsearch-7',
'hosts' = '${env:ELASTICSEARCH_HOSTS}',
'index' = '${env:GROUPS_ES_INDEX}',
'f
Hello,
I'm using the Table API to do a bunch of stateful transformations on CDC
Debezium rows and then insert final documents into Elasticsearch via the ES
connector.
I've noticed that Elasticsearch is constantly deleting and then inserting
documents as they update. Ideally, there would be no del
Hi Igal,
thanks for these pointers!
I currently deploy a Flink jar via docker copy. But this is a spike
setup anyway. I will now discard it and switch directly to working in
kubernetes.
So, just so I understand this right, the recommended production setup
would be:
* Build a docker image
Thanks Timo,
Checking if an element is in an Array does seem like a very useful function
to have. Is there any plan to add it?
Thanks
On Thu, Nov 5, 2020 at 7:26 AM Timo Walther wrote:
> Hi Rex,
>
> as far as I know, the IN operator only works on tables or a list of
> literals where the latter
Thank you for the response. We do see the heap actually shrink after
starting new jobs.
From: Matthias Pohl
Sent: Thursday, November 5, 2020 8:20 AM
To: Colletta, Edward
Cc: user@flink.apache.org
Subject: Re: a couple of memory questions
How do you deploy the job currently?
Are you using the data stream integration, or deploying it as a Flink Jar [1]?
(Also please note that the directories might be created, but without a
checkpoint interval set they will remain empty.)
Regarding your two questions:
It is true that you can theoretically share the
Hi Igal,
thanks for your quick and detailed reply! For me, this is the really
great defining feature of Stateful Functions: Separating
StreamProcessing "Infrastructure" from Business Logic Code, possibly
maintained by a different team.
Regarding your points: I did add the checkpoint interval
Hi Jan,
The architecture you outlined sounds good, and we've successfully run
mixed architectures like this.
Let me try to address your questions:
1)
To enable checkpointing you need to set the relevant values in your
flink-conf.yaml file.
execution.checkpointing.interval: (see [1])
state.che
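A minimal flink-conf.yaml sketch of the checkpointing-related settings; the interval, backend, and directory values here are placeholder examples, not recommendations:

```yaml
# flink-conf.yaml -- checkpointing sketch; all values are placeholders.
execution.checkpointing.interval: 30s
state.backend: rocksdb
state.checkpoints.dir: s3://example-bucket/checkpoints
```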
Hi Rex,
as far as I know, the IN operator only works on tables or a list of
literals, where the latter is just a shortcut for multiple OR
operations. I would just go with a UDF for this case. In SQL you could
do an UNNEST to convert the array into a table and then use the IN
operator. But
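As a sketch of the UNNEST approach (the table and column names here are made up for illustration):

```sql
-- Check whether an array column contains a given value by unnesting it
-- into rows and filtering. All names here are hypothetical.
SELECT DISTINCT g.id
FROM groups_table AS g
CROSS JOIN UNNEST(g.array_of_ids) AS t (aid)
WHERE t.aid = 42;
```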
Hi,
I did set up a flink session cluster on K8s.
Now I added the ssl configuration as shown in the documentation:
# Flink TLS
security.ssl.internal.enabled: true
security.ssl.internal.keystore: /config/internal-keystore/internal.keystore.jks
security.ssl.internal.truststore:
/config/internal-k
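For completeness, the password options that usually accompany those keystore/truststore paths look like this (a sketch; the values are placeholders to replace with your own):

```yaml
# Remaining internal-SSL options typically set alongside the store paths.
security.ssl.internal.keystore-password: changeit    # placeholder
security.ssl.internal.key-password: changeit         # placeholder
security.ssl.internal.truststore-password: changeit  # placeholder
```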
No, because that would break the API and any log-parsing infrastructure
relying on it.
On 11/5/2020 2:56 PM, Flavio Pompermaier wrote:
Just another question: should I open a JIRA to rename
ExecutionState.CANCELING to CANCELLING (indeed the enum's Javadoc
report CANCELLING)?
On Thu, Nov 5, 20
Hi,
I'm currently trying to set up a Flink Stateful Functions Job with the
following architecture:
* Kinesis Ingress (embedded)
* Stateful Function (embedded) that calls to and takes responses from an
external business logic function (python worker similar to the one in
the python greeter e
Hi,
everything prefixed with `org.apache.flink.table.planner` is Blink
planner. So you should be able to use those testing classes. The Blink
planner is also the default one since 1.11. In general, I would
recommend looking a bit into the testing package. There are many
different testing exam
Just another question: should I open a JIRA to rename
ExecutionState.CANCELING to CANCELLING (indeed the enum's Javadoc report
CANCELLING)?
On Thu, Nov 5, 2020 at 11:31 AM Chesnay Schepler wrote:
> The "mismatch" is due to you mixing job and vertex states.
>
> These are the states a job can be i
Hello Edward,
please find my answers within your message below:
On Wed, Nov 4, 2020 at 1:35 PM Colletta, Edward
wrote:
> Using Flink 1.9.2 with FsStateBackend, Session cluster.
>
>
>
>1. Does heap state get cleaned up when a job is cancelled?
>
> We have jobs that we run on a daily basis. W
Hi Arvid Heise,
Thank you for your reply.
Yes, my connection to the JM is bad!
Best wishes,forideal
At 2020-11-04 15:32:38, "Arvid Heise" wrote:
A jar upload shouldn't take minutes. There are two possibilities that
likely co-occurred:
- your jar is much bigger than neede
Yes, Calcite uses apiguardian. To answer your question, Aljoscha: no, I do
not use it directly.
It's a dependency of the shaded Calcite version inside the blink JAR.
On Thu, Nov 5, 2020 at 11:02 AM Timo Walther wrote:
> Hi Yuval,
>
> this error is indeed weird.
>
> @Aljoscha: I think Calcite use
Currently I use Flink 1.9.1. What I actually want to do is send some
messages from downstream operators to upstream operators, for which I am
considering using a static variable.
But that requires ensuring that both operators always run in the same
TaskManager process; can I use CoLocationGroup to solve
Ok, I understood. Unfortunately, from the documentation I was not able to
deduce the Map type of status-count, and I
thought that the job status and execution status were equivalent.
And what about the heuristic...? Could it make sense
On Thu, Nov 5, 2020 at 11:33 AM Chesnay Schepler wrote:
>
Admittedly, it can be out-of-sync if someone forgets to regenerate the
documentation, but they cannot be mixed up.
On 11/5/2020 11:31 AM, Chesnay Schepler wrote:
> The "mismatch" is due to you mixing job and vertex states.
>
> These are the states a job can be in (based on
> org.apache.flink.api.
> The "mismatch" is due to you mixing job and vertex states.
>
> These are the states a job can be in (based on
> org.apache.flink.api.common.JobStatus):
> [ "CREATED", "RUNNING", "FAILING", "FAILED", "CANCELLING",
> "CANCELED", "FINISHED", "RESTARTING", "SUSPENDED", "RECONCILING" ]
>
> T
What do you think about this very rough heuristic (obviously it makes
sense only for batch jobs)?
It's far from perfect, but at least it gives an idea of something going on..
PS: I found some mismatch from the states documented in [1] and the ones I
found in the ExecutionState enum..
[1]
https://c
It was planned for 1.12 but didn't make it. 1.13 should fully implement
FLIP-136. I just created issues to monitor the progress:
https://issues.apache.org/jira/browse/FLINK-19976
Regards,
Timo
On 04.11.20 18:43, Rex Fenley wrote:
Thank you for the info!
Is there a timetable for when the next
Hi Yuval,
this error is indeed weird.
@Aljoscha: I think Calcite uses apiguardian.
When I saw the initial error, it looked like there are different Apache
Calcite versions in the classpath. I'm wondering if this is a pure SBT
issue because I'm sure that other users would have reported this er
Yes.
There is also a Flink Forward session [1] (from 14:00) that talks about
the internals of the underlying changelog mechanism with a visual example.
Best,
Jark
[1]:
https://www.youtube.com/watch?v=KDD8e4GE12w&list=PLDX4T_cnKjD0ngnBSU-bYGfgVv17MiwA7&index=48&t=820s
On Thu, 5 Nov 2020 at 15:48, H