Hi,
in case anyone is interested, restoreState was removed in issue
https://issues.apache.org/jira/browse/FLINK-4196 and snapshotOperatorState
was replaced by snapshotState in 1.2
--Filippo
2018-03-12 15:37 GMT+01:00 Filippo Balicchia :
> Hi,
>
> I'm a newbie in Flink and in streaming engines and
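For reference, the snapshotState/initializeState pair lives on the CheckpointedFunction interface from 1.2 onwards. A minimal sketch, with a made-up counting mapper (names are not from the original thread):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;

// Hypothetical stateful mapper; only the checkpointing hooks matter here
// (snapshotState is the replacement for the old snapshotOperatorState).
public class CountingMapper implements MapFunction<String, String>, CheckpointedFunction {

    private transient ListState<Long> checkpointedCount; // managed operator state
    private long count;

    @Override
    public String map(String value) {
        count++;
        return value;
    }

    @Override
    public void snapshotState(FunctionSnapshotContext ctx) throws Exception {
        // Called on every checkpoint: write the current in-memory count into state.
        checkpointedCount.clear();
        checkpointedCount.add(count);
    }

    @Override
    public void initializeState(FunctionInitializationContext ctx) throws Exception {
        // Called on (re)start: register the state and, if restoring, read it back.
        checkpointedCount = ctx.getOperatorStateStore()
                .getListState(new ListStateDescriptor<>("count", Long.class));
        if (ctx.isRestored()) {
            for (Long c : checkpointedCount.get()) {
                count = c;
            }
        }
    }
}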
Hi,
Still stuck around this.
My understanding is that this is something Flink can't handle. If the batch size of
the Kafka producer is non-zero (which it ideally should be), there will be in-memory
records and data loss (in boundary cases). The only way I can handle this with Flink is
my checkpointing interval, w
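One related knob worth mentioning: the Flink Kafka producer can flush its in-memory batch as part of each checkpoint. A minimal sketch, assuming a FlinkKafkaProducer010 sink with placeholder broker and topic names:

import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaSinkSketch {

    public static void addKafkaSink(DataStream<String> stream) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // placeholder broker

        FlinkKafkaProducer010<String> producer =
                new FlinkKafkaProducer010<>("output-topic", new SimpleStringSchema(), props);
        producer.setFlushOnCheckpoint(true); // block the checkpoint until buffered records are acked
        producer.setLogFailuresOnly(false);  // fail the job on send errors instead of only logging them

        stream.addSink(producer);
    }
}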
Thanks a lot, Timo!
Best,
Max
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi,
I am using Flink SQL in my application. It simply reads records from a Kafka
source, converts them to a table, then runs a query to compute an over-window
aggregation for each record. A time-lag watermark assigner with a 10 ms time lag is used.
The performance is not ideal. The end-to-end latency, which is th
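A rough sketch of the setup being described, with placeholder table and field names (userId, amount, ts) and the 10 ms time-lag watermark assigner:

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class OverWindowSketch {

    // Stream of (userId, amount, eventTimeMillis) records, e.g. from a Kafka source.
    public static Table overAggregate(StreamTableEnvironment tEnv,
                                      DataStream<Tuple3<String, Double, Long>> kafkaStream) {

        // "Time lag" assigner: watermarks trail the highest seen timestamp by 10 ms.
        DataStream<Tuple3<String, Double, Long>> withTs = kafkaStream.assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<Tuple3<String, Double, Long>>(Time.milliseconds(10)) {
                    @Override
                    public long extractTimestamp(Tuple3<String, Double, Long> e) {
                        return e.f2; // event time carried in the third field
                    }
                });

        // Register as a table; ".rowtime" turns the Long field into an event-time attribute.
        tEnv.registerDataStream("events", withTs, "userId, amount, ts.rowtime");

        // One aggregated value per input record, over the preceding minute per user.
        return tEnv.sqlQuery(
                "SELECT userId, " +
                "  SUM(amount) OVER (PARTITION BY userId ORDER BY ts " +
                "    RANGE BETWEEN INTERVAL '1' MINUTE PRECEDING AND CURRENT ROW) AS total " +
                "FROM events");
    }
}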
Hi,
I'm afraid Flink does not currently support changing the schema of state when
restoring from a savepoint.
Best,
Aljoscha
> On 13. Mar 2018, at 07:36, kla wrote:
>
> Hi guys,
>
> I have the flink streaming job running (1.2.0 version) which has the
> following state:
>
> private transient
Hi Seth,
Thanks for sharing how you resolved the problem!
The problem might have been related to Flink's key groups which are used to
assign key ranges to tasks.
Not sure why this would be related to ZooKeeper being in a bad state. Maybe
Stefan (in CC) has an idea about the cause.
Also, it would
Hi,
Chesnay is right.
SQL and the Table API support neither early window results nor allowed
lateness to update results with late-arriving data.
If you need such features, you should use the DataStream API.
Best, Fabian
2018-03-13 12:10 GMT+01:00 Chesnay Schepler :
> I don't think you can specif
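A minimal DataStream sketch of what the SQL/Table API cannot express here, i.e. early firings plus allowed lateness; types and intervals are placeholders:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger;

public class EarlyAndLateResultsSketch {

    // Tumbling 1-minute windows that emit an updated sum every 10 seconds
    // and keep updating results for late records up to 5 minutes.
    public static DataStream<Tuple2<String, Long>> windowedSum(DataStream<Tuple2<String, Long>> input) {
        return input
                .keyBy(0)                                                  // key by the first tuple field
                .window(TumblingEventTimeWindows.of(Time.minutes(1)))
                .trigger(ContinuousEventTimeTrigger.of(Time.seconds(10)))  // early firings
                .allowedLateness(Time.minutes(5))                          // late updates
                .sum(1);                                                   // aggregate the second field
    }
}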
Does anyone know if this is a correct assumption:
DataStream sorted = stream.keyBy("partition");
Will this automatically route records with the same key to the same sink thread?
The behavior I am seeing is that a sink set up with multiple threads is seeing data
from the same hour.
Any good examples of how to sort data so
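A sketch of the assumption being asked about, with placeholder types: keyBy hash-partitions on the key, so all records with the same key go to the same parallel sink subtask, but one subtask can still see many different keys (for example several hours of data):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

public class KeyedSinkSketch {

    // Records are hash-partitioned on the first field ("partition"); every record
    // with the same key goes to the same sink subtask, but a single subtask can
    // still receive many different keys.
    public static void sinkByKey(DataStream<Tuple2<String, String>> stream) {
        stream
            .keyBy(0)
            .addSink(new SinkFunction<Tuple2<String, String>>() {
                @Override
                public void invoke(Tuple2<String, String> value) {
                    // placeholder per-record write
                }
            })
            .setParallelism(4);
    }
}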
Xiaochuan,
We are doing exactly as you described. We keep the injector as a global
static var.
But we extend FlinkJobManager and FlinkTaskManager to override the main
method and initialize the injector (and other things) during JVM startup,
which does cause tight code coupling. It is a little pa
Hi,
I'm evaluating Flink with the intent to integrate it into a Java project
that uses a lot of dependency injection via Guice. What would be the best
way to work with DI/Guice given that injected fields aren't Serializable?
I looked at this StackOverflow answer so far. To my understanding the
str
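One commonly suggested pattern (not necessarily the one from the StackOverflow answer) is to keep the injected field transient and resolve it in open() on the worker; MyService and MyModule below are hypothetical:

import com.google.inject.AbstractModule;
import com.google.inject.Guice;
import com.google.inject.Injector;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class EnrichingMapper extends RichMapFunction<String, String> {

    // Hypothetical Guice-managed dependency and module.
    public interface MyService { String enrich(String in); }

    public static class MyModule extends AbstractModule {
        @Override
        protected void configure() {
            bind(MyService.class).toInstance(in -> in.toUpperCase()); // toy binding
        }
    }

    private transient MyService service; // transient: never serialized with the function

    @Override
    public void open(Configuration parameters) {
        // Re-create (or look up a static/singleton) injector on the worker JVM.
        Injector injector = Guice.createInjector(new MyModule());
        service = injector.getInstance(MyService.class);
    }

    @Override
    public String map(String value) {
        return service.enrich(value);
    }
}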
Hi Bill,
The size of the program depends on the number and complexity of the SQL queries
that you are submitting.
Each query might be translated into a sequence of multiple operators. Each
operator has a string with generated code that will be compiled on the
worker nodes. The size of the code depends on
Hi,
A Flink application does not have a problem if it ingests two streams with
very different throughput as long as they are somewhat synced on their
event-time.
This is typically the case when ingesting real-time data. In such
scenarios, an application would not buffer more data than necessary.
Hi guys,
I have a Flink streaming job running (version 1.2.0) which has the
following state:
private transient ValueState>> userState;
With the following configuration:
final ValueStateDescriptor>> descriptor =
new ValueStateDescriptor<>("userState",
TypeInformation.of(new UserTyp
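The generic parameters above were stripped by the archive; a typical declaration of such keyed state, assuming a hypothetical UserType and a List value, might look like this:

import java.util.List;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class UserStateFunction extends RichFlatMapFunction<UserStateFunction.UserType, UserStateFunction.UserType> {

    // Placeholder for the POJO that is cut off in the original mail.
    public static class UserType {
        public String id;
    }

    private transient ValueState<List<UserType>> userState;

    @Override
    public void open(Configuration parameters) {
        ValueStateDescriptor<List<UserType>> descriptor = new ValueStateDescriptor<>(
                "userState",
                TypeInformation.of(new TypeHint<List<UserType>>() {}));
        userState = getRuntimeContext().getState(descriptor); // keyed state, requires keyBy upstream
    }

    @Override
    public void flatMap(UserType value, Collector<UserType> out) throws Exception {
        // read with userState.value(), write back with userState.update(...)
        out.collect(value);
    }
}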
Hi Gregory,
Your understanding is correct. It is not possible to assign UUIDs to the
operators generated by the SQL/Table planner.
To be honest, I am not sure whether the use case that you are describing
should be within the scope of the "officially" supported use cases of the API.
It would require in dep
Hey all,
I'm using the bucketing sink with a bucketer that creates a partition per customer
per day.
I sink the files to S3.
It is supposed to work on around 500 files at the same time (according to my
partitioning).
I have a critical problem of 'Too many open files'.
I've brought up two taskmanagers, each with
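For context, BucketingSink has settings to roll part files and to close buckets that have been inactive for a while, which usually bounds the number of files open at the same time. A sketch with placeholder path, bucketer, and thresholds:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

public class BucketingSinkSketch {

    public static void addBucketingSink(DataStream<String> stream) {
        BucketingSink<String> sink = new BucketingSink<>("s3://my-bucket/output"); // placeholder path
        sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd")); // stand-in for the per-customer bucketer
        sink.setBatchSize(1024 * 1024 * 128L);        // roll part files at 128 MB
        sink.setInactiveBucketCheckInterval(60_000L); // check for idle buckets every minute
        sink.setInactiveBucketThreshold(5 * 60_000L); // close buckets that were idle for 5 minutes
        stream.addSink(sink);
    }
}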
It turns out the issue was due to our zookeeper installation being in a bad
state. I am not clear enough on Flink's networking internals to explain how
this manifested as a partition-not-found exception, but hopefully this can
serve as a starting point for others who run into the same issue.
I have looked into the CEP library. I have posted an issue on
Stack Overflow.
https://stackoverflow.com/questions/49047879/global-windows-in-flink-using-custom-triggers-vs-flink-cep-pattern-api
However, the pattern matches all possible solutions on the stream of
events. Does Pattern have a notion o
Hello,
You said that "data is distributed very badly across slots"; do you mean
that only a small number of subtasks is reading from HDFS, or that the
keyed data is only processed by a few subtasks?
Flink does prioritize data locality over data distribution when reading
the files, but the fu
I don't think you can specify custom triggers when using pure SQL, but
maybe Fabian or Timo know a SQL way of implementing your goal.
On 12.03.2018 13:16, 李玥 wrote:
Hi Chirag,
Thanks for your reply!
I found that the provided ContinuousEventTimeTrigger should work in my
situation.
Most examples a
Hello,
Event-time and watermarks can be used to deal with out-of-order events,
but since you're using global windows (as opposed to time-based windows)
you have to implement the logic for doing this yourself.
Conceptually, what you would have to do is to not create your TripEv
when receiving a
Hello
I would like to know if Flink supports any user-level authentication, like
username/password, for the Flink web UI.
Regards
Sampath S
Hello,
Yes, I think you'll need to use a ProcessFunction and clean up the state
manually.
On 11.03.2018 15:13, sundy wrote:
hi:
My streaming application always keys by keys that contain an event timestamp,
such as keyBy("qps_1520777430"), so the expired keys (from 1 hour ago) are useless.
And
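A minimal sketch of that ProcessFunction approach on a keyed stream, with made-up types: register an event-time timer per key and clear the state once the key has been idle for an hour (assumes event-time timestamps are assigned):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Per-key counter that is cleared one hour (event time) after the last update,
// so state for expired keys does not accumulate forever.
public class ExpiringCounter extends ProcessFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    private static final long TTL_MS = 60 * 60 * 1000L; // one hour

    private transient ValueState<Long> count;
    private transient ValueState<Long> lastSeen;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
        lastSeen = getRuntimeContext().getState(new ValueStateDescriptor<>("lastSeen", Long.class));
    }

    @Override
    public void processElement(Tuple2<String, Long> value, Context ctx,
                               Collector<Tuple2<String, Long>> out) throws Exception {
        Long current = count.value();
        long updated = (current == null ? 0L : current) + value.f1;
        count.update(updated);
        lastSeen.update(ctx.timestamp());
        out.collect(Tuple2.of(value.f0, updated));

        // Schedule cleanup one hour after this element's event time.
        ctx.timerService().registerEventTimeTimer(ctx.timestamp() + TTL_MS);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx,
                        Collector<Tuple2<String, Long>> out) throws Exception {
        // Only clear if the key has really been idle for the full TTL.
        Long last = lastSeen.value();
        if (last != null && timestamp >= last + TTL_MS) {
            count.clear();
            lastSeen.clear();
        }
    }
}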
Hi all,
I use the code below to set the Kafka JAAS config; the serverConfig.jasspath is
/data/apps/spark/kafka_client_jaas.conf, but on a Flink standalone deployment
it crashes. I am sure the kafka_client_jaas.conf is valid, because other
applications (Spark Streaming) are still working fine with
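The snippet referenced above is cut off in the archive; one common way to pass JAAS/SASL settings to the Flink Kafka consumer (with Kafka clients 0.10.2+) looks roughly like the sketch below. Broker, topic, mechanism, and credentials are placeholders, and any referenced JAAS or keytab file must exist at the same path on every TaskManager host:

import java.util.Properties;

import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class SecureKafkaConsumerSketch {

    public static FlinkKafkaConsumer010<String> secureConsumer() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // placeholder
        props.setProperty("group.id", "my-group");
        props.setProperty("security.protocol", "SASL_PLAINTEXT");
        props.setProperty("sasl.mechanism", "PLAIN");
        // With Kafka clients 0.10.2+ the JAAS section can be passed inline,
        // instead of pointing every JVM at a jaas.conf file via
        // -Djava.security.auth.login.config=/data/apps/spark/kafka_client_jaas.conf
        props.setProperty("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"user\" password=\"secret\";");

        return new FlinkKafkaConsumer010<>("input-topic", new SimpleStringSchema(), props);
    }
}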