Hi Maminspapin,
I just answered another question similarly, so let me just c&p it here:
The beauty of Avro lies in having reader and writer schema and schema
compatibility, such that if your schema evolves over time (which will
happen in streaming naturally but is also very common in batch), you
Hi Sumeet,
The beauty of Avro lies in having reader and writer schema and schema
compatibility, such that if your schema evolves over time (which will
happen in streaming naturally but is also very common in batch), you can
still use your application as is without modification. For streaming, this
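To make that concrete, here is a minimal sketch (plain Avro, outside of Flink) of decoding with an explicit writer and reader schema; the schema variables and the payload bytes are placeholders you would obtain from your own setup, e.g. a schema registry:

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;

public class AvroEvolutionSketch {

    // writerSchema: the (possibly older) schema the bytes were serialized with
    // readerSchema: the schema the application expects today
    static GenericRecord decode(byte[] payload, Schema writerSchema, Schema readerSchema)
            throws Exception {
        // Avro resolves the differences between the two schemas while reading,
        // so added fields get their defaults and removed fields are skipped.
        DatumReader<GenericRecord> reader =
                new GenericDatumReader<>(writerSchema, readerSchema);
        Decoder decoder = DecoderFactory.get().binaryDecoder(payload, null);
        return reader.read(null, decoder);
    }
}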
Hi Bin,
I would put Flink applications into separate repos. It reduces compile
times and makes automatic deployment much easier (if you update the
master/release branch of application X, you simply deploy it, potentially
with some manual trigger in your CI/CD pipeline). You can also easily bump
Flin
Hi Alex,
The easiest way to verify that what you tried works is to look at
Flink's Web UI and check the topology.
The broadcast side of the input will always be ... well broadcasted (=not
chained). So you need to disable chaining only on the non-broadcasted
dataset.
val parsed: DataStream
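For illustration, a minimal Java sketch of the setup described above; the stream names, ports and the trivial functions are made up, only the placement of disableChaining() on the non-broadcast input matters:

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class ChainingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        MapStateDescriptor<String, String> rulesDesc =
                new MapStateDescriptor<>("rules", Types.STRING, Types.STRING);

        DataStream<String> events = env.socketTextStream("localhost", 9000);
        DataStream<String> rules = env.socketTextStream("localhost", 9001);

        // Break the chain only on the non-broadcast input; the broadcast side
        // is never chained anyway.
        SingleOutputStreamOperator<String> parsed =
                events.map(String::trim).disableChaining();

        parsed.connect(rules.broadcast(rulesDesc))
                .process(new BroadcastProcessFunction<String, String, String>() {
                    @Override
                    public void processElement(String value, ReadOnlyContext ctx,
                            Collector<String> out) {
                        out.collect(value);
                    }

                    @Override
                    public void processBroadcastElement(String rule, Context ctx,
                            Collector<String> out) throws Exception {
                        ctx.getBroadcastState(rulesDesc).put(rule, rule);
                    }
                })
                .print();

        env.execute("chaining-check");
    }
}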
Hi Vijay,
if you don't specify a checkpoint, then Flink assumes you want to start
from scratch (e.g., you had a bug in your business logic and need to start
completely without state).
If there is any failure and Flink restarts automatically, it will always
pick up from the latest checkpoint [1].
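As a rough illustration of that behavior (API names as of Flink 1.12; the interval is just an example value), enabling checkpointing and retaining checkpoints on cancellation looks roughly like this, so that a checkpoint path exists to resume from manually:

import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Take a checkpoint every 60s.
env.enableCheckpointing(60_000);
// Keep completed checkpoints around even after cancellation, so a manual
// restart can resume from them with `bin/flink run -s <checkpoint-path>`.
env.getCheckpointConfig().enableExternalizedCheckpoints(
        ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);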
Hello,
I've set up Flink as an Application Cluster in Kubernetes. Now I'm looking
into monitoring the Flink cluster in Datadog. This is what is configured
in the flink-conf.yaml to emit metrics:
metrics.scope.jm: flink.jobmanager
metrics.scope.jm.job: flink.jobmanager.job
metrics.scope.tm: flink
Hi,
The flink official document clearly states that "Simple mutual
authentication may be enabled by configuration if authentication of
connections to the REST endpoint is required, but we recommend to deploy a
“side car proxy”: Bind the REST endpoint to the loopback interface (or the
pod-local int
I have tried to add 'classloader.parent-first-patterns.additional:
"ru.yandex.clickhouse" ' to flink-config, but problem still exist.
Is there lightweight way to put clickhouse JDBC driver on Flink lib/ folder?
My DDL is:
CREATE TABLE csv (
id BIGINT,
name STRING
) WITH (
'connector' = 'filesystem',
'path' = '.',
'format' = 'csv'
);
Best,
Kurt
On Fri, Apr 9, 2021 at 10:00 AM Kurt Young wrote:
> Hi Flavio,
>
> We would recommend you to use new table source & sin
Hi Till, I have 2 follow-ups.
(1) Why is Hive special, while for connectors such as Kafka, the docs
suggest simply bundling the Kafka connector dependency with my user code?
(2) It seems the document misses the "before you start the cluster" part -
does it always require a cluster restart wheneve
Hi Flavio,
We would recommend using the new table source & sink interfaces, which
have different property keys compared to the old ones, e.g. 'connector'
vs. 'connector.type'.
You can follow the 1.12 doc [1] to define your csv table, everything should
work just fine.
*Flink SQL> set table.dml-
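In case a code-level example helps, here is a rough Java sketch using the new-style 'connector' key with the CSV DDL shown above (the path is a made-up placeholder; the 1.12 docs remain the authoritative reference for the options):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

EnvironmentSettings settings = EnvironmentSettings.newInstance()
        .useBlinkPlanner()
        .inStreamingMode()
        .build();
TableEnvironment tEnv = TableEnvironment.create(settings);

// New interfaces are selected via 'connector' (legacy sources used 'connector.type').
tEnv.executeSql(
        "CREATE TABLE csv (" +
        "  id BIGINT," +
        "  name STRING" +
        ") WITH (" +
        "  'connector' = 'filesystem'," +
        "  'path' = '/tmp/csv-data'," +
        "  'format' = 'csv'" +
        ")");

tEnv.executeSql("SELECT * FROM csv").print();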
Thanks Arvid! I'm not completely clear on where to apply your suggestions.
I've included a sketch of my job below, and I have a couple of questions:
1. It looks like enableObjectReuse() is a global setting; should I worry
about whether I'm using any mutable data between stages?
2. Should I disableCh
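On question 1: enableObjectReuse() is indeed set once on the execution config, roughly as in the sketch below; the general caveat is that with reuse enabled, user functions must not hold on to or mutate records after emitting them (how that applies here depends on the actual operators in the job):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Global switch: deserialized records may be reused between chained operators,
// so user functions must not cache references to, or mutate, records they
// have already passed downstream.
env.getConfig().enableObjectReuse();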
Thanks, it was working fine with: bin/flink run -s
s3://bucketxx-app/flink/checkpoint/iv/f1ef373e27203907d2d568bca31e6556/chk-384/
\
On Thu, Apr 8, 2021 at 11:42 AM Vijayendra Yadav
wrote:
> Hi Arvid,
>
> Thanks for your response. I did not restart from the checkpoint. I assumed
> Flink would lo
Hi,
Did you put the clickhouse JDBC driver on Flink main classpath (in lib
folder) and not in user-jar - as described here:
https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/debugging/debugging_classloading.html#unloading-of-dynamically-loaded-classes-in-user-code?
When we enco
Hi Till,
since I was using the same WITH-clause both for reading and writing, I
discovered that overwrite is actually supported in the Sinks, while in the
Sources an exception is thrown (I was thinking that those properties were
simply ignored).
However, the quote-character is not supported in the si
Hi,
don't know if this is the problem you're facing, but some time ago we
encountered two issues connected to the REST API and increased disk usage
after each submission:
https://issues.apache.org/jira/browse/FLINK-21164
https://issues.apache.org/jira/browse/FLINK-9844
- they're closed ATM, but
Hi Arvid,
Thanks for your response. I did not restart from the checkpoint. I assumed
Flink would look for a checkpoint upon restart automatically.
*Should I restart like below?*
bin/flink run -s
s3://bucketxx-app/flink/checkpoint/iv/f1ef373e27203907d2d568bca31e6556/chk-384/
\
Thanks,
Vijay
O
Thank you, that makes sense.
On Thu, Apr 8, 2021 at 12:37 AM Timo Walther wrote:
> Hi Deepthi,
>
> 1. Correct
> 2. Correct
> 3. Incremental snapshots simply manage references to RocksDB's sstables.
> You can find a full explanation here [1]. Thus, the payload is a
> blackbox for Flink and Flink'
I have deployed my own Flink setup in AWS ECS: one Service for the JobManager
and one Service for the TaskManagers. I am running one ECS task for the
JobManager and 3 ECS tasks for the TaskManagers.
I have a kind of batch job which I upload using the Flink REST API every day
with changing arguments; when I submi
Hi Yik San,
for future reference, I am copying my answer from SO here:
The reason for this difference is that for Hive it is recommended to start
the cluster with the respective Hive dependencies. The documentation [1]
states that it's best to put the dependencies into the lib directory before
you
Hi Flavio,
I tried to execute the code snippet you have provided and I could not
reproduce the problem.
Concretely I am running this code:
final EnvironmentSettings envSettings = EnvironmentSettings.newInstance()
        .useBlinkPlanner()
        .inStreamingMode()
        .build();
final TableEnvironment
Thanks Till.
Hi Jark,
Any inputs? Going through the code of 1.1 and 1.3 in the meantime.
Thanks,
Hemant
On Thu, Apr 8, 2021 at 3:52 PM Till Rohrmann wrote:
> Hi Hemant,
>
> I am pulling in Jark who is most familiar with Flink's cdc connector. He
> might also be able to tell whether the fix ca
As identified with the community, it's a bug; more information is in issue
https://issues.apache.org/jira/browse/FLINK-22113
On Sat, Apr 3, 2021 at 8:43 PM Kai Fu wrote:
> Hi team,
>
> We have a use case to join multiple data sources to generate a
> continuous updated view. We defined primary key
Hi Kevin,
when decreasing the TaskManager count I assume that you also decrease the
parallelism of the Flink job. There are three aspects which can then cause
a slower recovery.
1) Each Task gets a larger key range assigned. Therefore, each TaskManager
has to download more data in order to restar
Hi Hemant,
I am pulling in Jark who is most familiar with Flink's cdc connector. He
might also be able to tell whether the fix can be backported.
Cheers,
Till
On Thu, Apr 8, 2021 at 10:42 AM bat man wrote:
> Anyone who has faced similar issues with cdc with Postgres.
>
> I see the restart_lsn
IIUC, your program will eventually generate 100 ChildFirstClassLoaders in
a TM. But they should always be GC'd when the job finishes. So, as Arvid said,
you'd better check who is referencing those ChildFirstClassLoaders.
Best,
Yangze Guo
On Thu, Apr 8, 2021 at 5:43 PM 太平洋 <495635...@qq.com> wrote:
>
> My app
My application program looks like this. Does this structure have some problem?
public class StreamingJob {
public static void main(String[] args) throws Exception {
int i = 0;
while (i < 100) {
try {
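For reference, a hypothetical reconstruction of such a looping driver (the actual job body is cut off above, so the pipeline here is made up); each executed job loads user code through its own ChildFirstClassLoader on the TaskManager, which should become collectible once the job finishes unless something still references it:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LoopingDriverSketch {
    public static void main(String[] args) throws Exception {
        int i = 0;
        while (i < 100) {
            try {
                // One independent job per iteration; every execution gets a fresh
                // user-code classloader on the TaskManager side.
                StreamExecutionEnvironment env =
                        StreamExecutionEnvironment.getExecutionEnvironment();
                env.fromElements(i).print();
                env.execute("job-" + i);
            } catch (Exception e) {
                e.printStackTrace();
            }
            i++;
        }
    }
}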
The question is cross-posted on Stack Overflow
https://stackoverflow.com/questions/67001326/why-does-flink-quickstart-scala-suggests-adding-connector-dependencies-in-the-de
.
## Connector dependencies should be in default scope
This is what [flink-quickstart-scala](
https://github.com/apache/flin
Any help here? Moreover, if I use the DataStream API there's no left/right
outer join yet... are those meant to be added in Flink 1.13 or 1.14?
On Wed, Apr 7, 2021 at 12:27 PM Flavio Pompermaier
wrote:
> Hi to all,
> I'm testing writing to a CSV using Flink 1.13 and I get the following
> error:
>
Has anyone faced similar issues with CDC with Postgres?
I see the restart_lsn and confirmed_flush_lsn constant since the snapshot
replication records were streamed even though I have tried inserting
a record in the whitelisted table.
select * from pg_replication_slots;
slot_name | plugin
Hi Vijay,
edit: After re-reading your message: are you sure that you restart from a
checkpoint/savepoint? If you just start the application anew and use LATEST
initial position, this is the expected behavior.
--- original intended answer if you restart from checkpoint
this is definitively not the
Hi,
which Flink version are you using?
Could you also share the resulting plan with us using
`TableEnvironment.explainSql()`?
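For reference, a minimal sketch of obtaining that plan (the table and query here are made-up placeholders; use your actual LISTAGG query against a registered table):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

TableEnvironment tEnv =
        TableEnvironment.create(EnvironmentSettings.newInstance().build());
// Assumes a table named 'employees' has already been registered via DDL.
String plan = tEnv.explainSql(
        "SELECT department, LISTAGG(name) FROM employees GROUP BY department");
System.out.println(plan);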
Thanks,
Timo
On 07.04.21 17:29, soumoks123 wrote:
I receive the following error when trying to use the LISTAGG function in
the Table API.
java.lang.RuntimeException:
Hi Deepthi,
1. Correct
2. Correct
3. Incremental snapshots simply manage references to RocksDB's sstables.
You can find a full explanation here [1]. Thus, the payload is a
blackbox for Flink and Flink's compression flag has no impact. So we
fully rely on what RocksDB offers.
4. Correct
I hope t
Hi,
can you check the content of the JAR file that you are submitting? There
should be a `META-INF/services` directory with an
`org.apache.flink.table.factories.Factory` file that should list the
Parquet format.
See also here:
https://ci.apache.org/projects/flink/flink-docs-master/docs/connec
Hi,
I don't think there is a Flink-specific answer to this question. Just do
what you would normally do with a normal Java application running inside a
JVM. If there is an OOM on heap space, you can either bump the heap
space or reduce its usage. The only Flink-specific part is probably