Hello guys,
We have discovered a minor issue with Flink 1.5 on YARN in particular, which
is related to the way Flink manages temp paths (io.tmp.dirs) in the
configuration:
https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#io-tmp-dirs
1. From what we can see in the code, de
Hi Chris,
MySQL (and maybe other DBMS as well) offers special syntax for upserts.
The answers to this SO question [1] recommend "INSERT INTO ... ON DUPLICATE
KEY UPDATE ..." or "REPLACE INTO ...".
However, AFAIK this syntax is not standardized and might vary from DBMS to
DBMS.
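For illustration, a minimal JDBC sketch of that MySQL-specific variant; the
word_counts(word PRIMARY KEY, cnt) table and the connection settings are made up:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class MySqlUpsertExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL and credentials; adjust to your setup.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "password")) {
            // MySQL-specific upsert: insert the row, or update the count if a
            // row with the same primary key already exists.
            String upsert = "INSERT INTO word_counts (word, cnt) VALUES (?, ?) "
                    + "ON DUPLICATE KEY UPDATE cnt = VALUES(cnt)";
            try (PreparedStatement stmt = conn.prepareStatement(upsert)) {
                stmt.setString(1, "flink");
                stmt.setLong(2, 42L);
                stmt.executeUpdate();
            }
        }
    }
}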
Best, Fabian
[1]
Hi Flink users,
I'm new to Flink and trying to evaluate a couple of streaming frameworks by
implementing the same apps.
While implementing apps with both the Table API and SQL, I found there was 'no
watermark' presented in the Flink UI, even though I had gone through the
struggle of applying a row time attribute.
For example, be
There is also the SQL:2003 MERGE statement that can be used to implement
UPSERT logic.
It is a bit verbose but supported by Derby [1].
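Just a sketch, untested: an upsert via MERGE against Derby, assuming Derby
10.11+, the hypothetical word_counts table from above, and Derby's one-row
SYSIBM.SYSDUMMY1 table as the source relation:

import java.sql.Connection;
import java.sql.PreparedStatement;

public class DerbyMergeExample {
    // Upsert a (word, cnt) pair via SQL:2003 MERGE, using Derby's one-row
    // dummy table as the source relation of the MERGE.
    static void upsert(Connection conn, String word, long cnt) throws Exception {
        String merge =
              "MERGE INTO word_counts t "
            + "USING SYSIBM.SYSDUMMY1 "
            + "ON t.word = ? "
            + "WHEN MATCHED THEN UPDATE SET t.cnt = ? "
            + "WHEN NOT MATCHED THEN INSERT (word, cnt) VALUES (?, ?)";
        try (PreparedStatement stmt = conn.prepareStatement(merge)) {
            stmt.setString(1, word);
            stmt.setLong(2, cnt);
            stmt.setString(3, word);
            stmt.setLong(4, cnt);
            stmt.executeUpdate();
        }
    }
}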
Best, Fabian
[1] https://issues.apache.org/jira/browse/DERBY-3155
2018-07-04 10:10 GMT+02:00 Fabian Hueske :
> Hi Chris,
>
> MySQL (and maybe other DBMS as well
Looking at the other threads, I assume you solved this issue.
The problem should have been that FlinkKafkaConsumer09 is not included in
the flink-connector-kafka-0.11 module, because it is the connector for
Kafka 0.9 and not Kafka 0.11.
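For reference, a minimal sketch using the consumer class that the 0.11 module
actually provides; broker address, topic, and group id are hypothetical:

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class Kafka011ConsumerExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.setProperty("group.id", "test");

        // flink-connector-kafka-0.11 provides FlinkKafkaConsumer011; the 0.9
        // consumer lives in flink-connector-kafka-0.9 instead.
        env.addSource(new FlinkKafkaConsumer011<>("my-topic", new SimpleStringSchema(), props))
           .print();

        env.execute("Kafka 0.11 consumer example");
    }
}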
Best, Fabian
2018-07-02 11:20 GMT+02:00 Mich Talebzadeh :
Sorry, I forgot to mention the version: Flink 1.5.0, and I ran the app in
IntelliJ; I have not tried it on a cluster.
On Wed, Jul 4, 2018 at 5:15 PM, Jungtaek Lim wrote:
> Hi Flink users,
>
> I'm new to Flink and trying to evaluate couple of streaming frameworks via
> implementing same apps.
>
> While implementing
Hi Ashish,
I think we don't want to make it an official public API (at least not at
this point), but maybe you can dig into the internal API and leverage it
for your use case.
I'm not 100% sure about all the implications; that's why I pulled Stefan into
this thread.
Best, Fabian
2018-07-02 15:3
Hi Elias,
I agree, the docs lack a coherent discussion of event time features.
Thank you for this write up!
I just skimmed your document and will provide more detailed feedback later.
It would be great to add such a page to the documentation.
Best, Fabian
2018-07-03 3:07 GMT+02:00 Elias Levy :
Hi,
The Evictor is useful if you want to remove some elements from the window
state but not all.
This also implies that a window is evaluated multiple times, because
otherwise you could just filter in the user function (as you suggested)
and purge the whole window afterwards.
Evictors are commo
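A minimal sketch of the idea with one of the built-in evictors; the trigger and
evictor sizes and the (word, count) input are made up for illustration:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows;
import org.apache.flink.streaming.api.windowing.evictors.CountEvictor;
import org.apache.flink.streaming.api.windowing.triggers.CountTrigger;

public class EvictorExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical (word, count) input.
        DataStream<Tuple2<String, Long>> input = env.fromElements(
                Tuple2.of("a", 1L), Tuple2.of("b", 2L), Tuple2.of("a", 3L));

        input.keyBy(0)
             // A global window that fires on every 10th element per key...
             .window(GlobalWindows.create())
             .trigger(CountTrigger.of(10))
             // ...but keeps only the 100 most recent elements in window state,
             // so the same (partially evicted) window is evaluated repeatedly.
             .evictor(CountEvictor.of(100))
             .sum(1)
             .print();

        env.execute("Evictor example");
    }
}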
The watermark display in the UI is bugged in 1.5.0.
It is fixed on master and the release-1.5 branch, and will be included
in 1.5.1, which is slated to be released next week.
On 04.07.2018 10:22, Jungtaek Lim wrote:
Sorry I forgot to mention the version: Flink 1.5.0, and I ran the app
in Intell
Hi Amol,
The memory consumption depends on the query/operation that you are doing.
Time-based operations like group-window aggregations,
over-window aggregations, or window joins can automatically clean up their
state once the data is no longer needed.
Operations such as non-windowed aggregations
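Not from the original (truncated) mail, but for such operations the state size
can be bounded via the query configuration's idle state retention time; a
sketch, assuming the Flink 1.5 Table API and a StreamTableEnvironment:

import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.StreamQueryConfig;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

// Remove state for keys that have been idle for 12 to 24 hours, so that
// non-windowed aggregations do not grow without bound. The config is passed
// when the result is emitted, e.g. tEnv.toRetractStream(table, Row.class, qConfig).
StreamQueryConfig qConfig = tEnv.queryConfig();
qConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(24));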
It's not really path-parsing logic, but path handling I suppose; see
RocksDBStateBackend#setDbStoragePaths().
I went ahead and converted said method into a simple test method; maybe
this is enough to debug the issue.
I assume this regression was caused by FLINK-6557, which refactored the
sta
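Not from the original mail, but for context, a sketch of how that method is
typically used; the checkpoint URI and local directories are hypothetical:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbStoragePathsExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoints go to the (hypothetical) HDFS directory, while the local
        // RocksDB working directories are spread across the given local disks.
        RocksDBStateBackend backend = new RocksDBStateBackend("hdfs:///flink/checkpoints");
        backend.setDbStoragePaths("/data1/flink/rocksdb", "/data2/flink/rocksdb");
        env.setStateBackend(backend);
    }
}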
Hi Jungtaek,
Flink 1.5.0 features a TableSource for Kafka and JSON [1], incl. timestamp
& watermark generation [2].
It would be great if you could let us know if that addresses your use case
and, if not, what's missing or not working.
So far Table API / SQL does not have support for late-data side
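A sketch of how such a source can be declared in 1.5; the topic, broker
address, and schema fields are hypothetical, and the rowtime attribute is
derived from an existing field of the same name with watermarks that allow 30
seconds of out-of-orderness:

import java.util.Properties;
import org.apache.flink.streaming.connectors.kafka.Kafka010JsonTableSource;
import org.apache.flink.streaming.connectors.kafka.KafkaTableSource;
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.table.api.Types;
import org.apache.flink.table.sources.tsextractors.ExistingField;
import org.apache.flink.table.sources.wmstrategies.BoundedOutOfOrderTimestamps;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker
props.setProperty("group.id", "test");

KafkaTableSource source = Kafka010JsonTableSource.builder()
    .forTopic("events")                                   // hypothetical topic
    .withKafkaProperties(props)
    .withSchema(TableSchema.builder()
        .field("user", Types.STRING())
        .field("rowtime", Types.SQL_TIMESTAMP())
        .build())
    // "rowtime" is the event-time attribute; its value is extracted from the
    // existing field of the same name, and watermarks allow records to be at
    // most 30 seconds out of order.
    .withRowtimeAttribute(
        "rowtime",
        new ExistingField("rowtime"),
        new BoundedOutOfOrderTimestamps(30000L))
    .build();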
Reference: https://issues.apache.org/jira/browse/FLINK-9739
On 04.07.2018 10:46, Chesnay Schepler wrote:
It's not really path-parsing logic, but path handling i suppose; see
RocksDBStateBackend#setDbStoragePaths().
I went ahead and converted said method into a simple test method,
maybe this i
Yes indeed, thanks. It is all working fine, but only writing to a text file.
I want to do with Flink what I do with Spark Streaming: write high-value
events to HBase on HDFS.
Dr Mich Talebzadeh
LinkedIn *
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOAB
Thanks Chesnay! Great news to hear. I'll try it out with the latest master branch.
Thanks Fabian for providing the docs!
I guess I already tried out KafkaJsonTableSource and fell back to a
custom TableSource since the type of the rowtime field is unfortunately a string,
and I needed to parse and map to n
Hi Fabian,
We do need a consistent view of the data: we need the Counter and the HDFS file
to be consistent. For example, when the Counter indicates that 1000 messages
were written to HDFS, there must be exactly 1000 messages in HDFS ready to read.
The data we write to HDFS is collected by an Agent (whic
State is maintained in the configured state backend [1].
Best, Fabian
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/state_backends.html
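A sketch (not from the original mail) of selecting a state backend
programmatically; the checkpoint directory is hypothetical, and RocksDB is the
usual choice once state no longer fits in memory:

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StateBackendExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Keep working state on the TaskManager heap and write checkpoints to the
        // (hypothetical) HDFS directory; switch to RocksDBStateBackend when state
        // exceeds available memory.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
    }
}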
2018-07-04 11:38 GMT+02:00 Amol S - iProgrammer :
> Hello fabian,
>
> Thanks for your quick response,
>
> According to above conver
Hi Ahmad,
Some tricks that might help to bring down the effort per tenant if you run
one job per tenant (or key per tenant):
- Pre-aggregate records in a 5-minute tumbling window (see the sketch below).
However, pre-aggregation does not work for FoldFunctions.
- Implement the window as a custom ProcessFunction that mai
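A minimal sketch of the pre-aggregation idea from the first bullet; the
(tenantId, value) input and the window size are illustrative, and in practice
the input would be an unbounded source such as Kafka:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class PreAggregationExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical (tenantId, value) records; a real job would read from Kafka.
        DataStream<Tuple2<String, Long>> events = env.fromElements(
                Tuple2.of("tenant-1", 1L), Tuple2.of("tenant-2", 5L), Tuple2.of("tenant-1", 3L));

        // Pre-aggregate per tenant in 5-minute tumbling (processing-time) windows.
        // A ReduceFunction can be applied incrementally, so only one value per key
        // and window is kept in state; as noted above, this does not work for
        // FoldFunctions.
        events.keyBy(0)
              .timeWindow(Time.minutes(5))
              .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1))
              .print();

        env.execute("Per-tenant pre-aggregation");
    }
}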
Hi Fabian,
The one-job-per-tenant model soon becomes hard to maintain. For example, 1000
tenants would require 1000 Flink jobs, and providing HA and resilience for 1000
jobs is not a trivial task.
This is why we are hoping to have a single Flink job handle all the tenants
through keyBy on tenant. However t
Would it be feasible for you to partition your tenants across jobs, like
for example 100 customers per job?
On 04.07.2018 12:53, Ahmad Hassan wrote:
Hi Fabian,
One job per tenant model soon becomes hard to maintain. For example
1000 tenants would require 1000 Flink and providing HA and resili
Hi Jungtaek,
If it is "only" about the missing support for parsing a string as a timestamp,
you could also implement a custom TimestampExtractor that works similarly to
the ExistingField extractor [1].
You would need to adjust a few things and use the expression
"Cast(Cast('tsString, SqlTimeTypeInfo.TIME
Hi Oleksandr,
I think your conclusions are correct. Thank you for looking into it. You can
open a JIRA ticket describing the issue.
Best,
Gary
On Wed, Jul 4, 2018 at 9:30 AM, Oleksandr Nitavskyi
wrote:
> Hello guys,
>
> We have discovered minor issue with Flink 1.5 on YARN particularly which
>
Hi Xilang,
I thought about this again.
The bucketing sink would need to roll on event-time intervals (similar to
the current processing-time rolling), triggered by watermarks, in order to
support consistency.
However, it would also need to maintain a write-ahead log of all received
rows an
Hi:
I want to set some configs in the source functions using
getRuntimeContext().getExecutionConfig().setGlobalJobParameters(parameterTool)
and use the configs in downstream operators, such as a filter function,
through getGlobalJobParameters, but it returns a null pointer exception. Is
Hi,
In Spark one can handle backpressure by setting the Spark conf parameter:
sparkConf.set("spark.streaming.backpressure.enabled","true")
With backpressure you make the Spark Streaming application stable, i.e. it
receives data only as fast as it can process it. In general one needs to
ensure that you
Thanks again Fabian for the nice suggestion!
I finally got it working by applying your suggestion. A couple of tricks
were needed:
1. I had to apply a hack (create a new TimestampExtractor class in the package
org.apache.flink.blabla...) since Expression.resultType is defined as
"package private"
Hi,
Glad you could get it to work! That's great :-)
Regarding your comments:
1) Yes, I think we should make resultType() public. Please open a Jira
issue and describe your use case.
Btw. would you like to contribute your TimestampExtractor to Flink (or even
a more generic one that allows to confi
Thanks Fabian, filed FLINK-9742 [1].
I'll submit a PR for FLINK-8094 providing my TimestampExtractor. The
implementation is also described in FLINK-9742. I'll start with the current
implementation, which just leverages the automatic cast from STRING to
SQL_TIMESTAMP, but we could improve it in the PR. Fe
Thanks - I will wait for Stefan’s comments before I start digging in.
> On Jul 4, 2018, at 4:24 AM, Fabian Hueske wrote:
>
> Hi Ashish,
>
> I think we don't want to make it an official public API (at least not at this
> point), but maybe you can dig into the internal API and leverage it for yo
Hi zhen,
Global configs cannot be passed like this. You can set the global configs
through the ExecutionConfig; more details here [1].
[1]
https://ci.apache.org/projects/flink/flink-docs-master/dev/best_practices.html#register-the-parameters-globally
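A minimal sketch (not from the original mail) of the pattern described in [1];
the "allowed" parameter is hypothetical:

import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class GlobalParametersExample {

    public static void main(String[] args) throws Exception {
        ParameterTool parameters = ParameterTool.fromArgs(args);

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Register the parameters on the ExecutionConfig in the main method,
        // not from inside a source function at runtime.
        env.getConfig().setGlobalJobParameters(parameters);

        env.fromElements("a", "b", "c")
           .filter(new MyFilter())
           .print();

        env.execute("Global job parameters example");
    }

    public static class MyFilter extends RichFilterFunction<String> {
        private String allowed;

        @Override
        public void open(Configuration config) {
            // Retrieve the globally registered parameters in any rich function.
            ParameterTool parameters = (ParameterTool) getRuntimeContext()
                    .getExecutionConfig().getGlobalJobParameters();
            allowed = parameters.get("allowed", "a"); // hypothetical parameter
        }

        @Override
        public boolean filter(String value) {
            return allowed.equals(value);
        }
    }
}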
On Wed, Jul 4, 2018 at 8:27 PM, zhen li wrote:
Hi all,
I am working on a prototype which should include Flink in reactive-systems
software. The easiest use case is a traditional bank system where
I have one microservice for transactions and another one for
accounts/balances.
Both are connected with Kafka.
Transactions record a transaction
Hi all!
A late follow-up with some thoughts:
In principle, all of these are good suggestions and are on the roadmap. We are
trying to make the release "by time", meaning we aim for a certain date
(roughly in two weeks) and take the features that are ready into the release.
Looking at the status of the f
Looks interesting.
As I understand it, you have a microservice-based ingestion where a topic is
defined for streaming messages that include transactional data. These
transactions should already exist in your DB. For now we look at the DB as part
of your microservices and take a logical view of it.
I think it is a little bit overkill to use Flink for such a simple system.
> On 4. Jul 2018, at 18:55, Yersinia Ruckeri wrote:
>
> Hi all,
>
> I am working on a prototype which should include Flink in a reactive systems
> software. The easiest use-case with a traditional bank system where I ha