Re: Apache Flink and serious streaming stateful processing

2015-10-15 Thread Gyula Fóra
> In the meantime, this is the performance that attracts me to RocksDb: > >> Measure performance to load 1B keys into the database. The keys are >> inserted in random order. >> rocksdb: 103 minutes, 80 MB/sec (total data size 481 GB, 1 billion >> key-values) >> Measure pe

Re:

2015-10-19 Thread Gyula Fóra
It's 0.10-SNAPSHOT Gyula Maximilian Michels wrote (on Mon, 19 Oct 2015, 17:13): > I forgot to ask you: Which version of Flink are you using? 0.9.1 or > 0.10-SNAPSHOT? > > On Mon, Oct 19, 2015 at 5:05 PM, Maximilian Michels > wrote: > > Hi Jakob, > > > > Thanks. Flink allocates its ne

Re: Custom TimestampExtractor and FlinkKafkaConsumer082

2015-11-16 Thread Gyula Fóra
Could this part of the extractor be the problem, Aljoscha? @Override public long getCurrentWatermark() { return Long.MIN_VALUE; } Gyula Konstantin Knauf wrote (on Mon, 16 Nov 2015, 10:39): > Hi Aljoscha, > > thanks for your answer. Yes I am using the same TimestampExtr
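For illustration, a minimal sketch of the extractor shape under discussion, assuming the Flink 0.10-era TimestampExtractor interface (the MyEvent type is a placeholder). Returning Long.MIN_VALUE from getCurrentWatermark() means the periodic watermark never advances, which is exactly why event-time windows downstream could fail to fire:

```java
import org.apache.flink.streaming.api.functions.TimestampExtractor;

public class MyExtractor implements TimestampExtractor<MyEvent> {

    @Override
    public long extractTimestamp(MyEvent element, long currentTimestamp) {
        return element.getTimestamp(); // use the event's own timestamp
    }

    @Override
    public long extractWatermark(MyEvent element, long extractedTimestamp) {
        return Long.MIN_VALUE; // no per-element watermark
    }

    @Override
    public long getCurrentWatermark() {
        return Long.MIN_VALUE; // suspected problem: the watermark never advances
    }
}
```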

Re: ReduceByKeyAndWindow in Flink

2015-11-23 Thread Gyula Fóra
Hi, Alright, it seems there are multiple ways of doing this. I would do something like: ds.keyBy(key) .timeWindow(w) .reduce(...) .timeWindowAll(w) .reduce(...) Maybe Aljoscha could jump in here :D Cheers, Gyula Fabian Hueske wrote (on Mon, 23 Nov 2015, 11:21): > If you set the key
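Spelled out, the suggested two-stage reduce could look like the following sketch, assuming the DataStream API of that era and Tuple2<String, Integer> elements where f1 carries a partial count (both assumptions for illustration):

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class TwoStageReduce {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Tuple2<String, Integer>> ds =
                env.fromElements(Tuple2.of("a", 1), Tuple2.of("b", 2), Tuple2.of("a", 3));

        ds.keyBy(0)                                          // parallel pre-aggregation per key
          .timeWindow(Time.seconds(10))
          .reduce((a, b) -> Tuple2.of(a.f0, a.f1 + b.f1))
          .timeWindowAll(Time.seconds(10))                   // global combine, parallelism 1
          .reduce((a, b) -> Tuple2.of("total", a.f1 + b.f1))
          .print();

        env.execute("two-stage reduce");
    }
}
```

The second window only sees one pre-aggregated element per key, which keeps the non-parallel timeWindowAll stage cheap; the reply below points out the latency cost of chaining two windows.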

Re: ReduceByKeyAndWindow in Flink

2015-11-23 Thread Gyula Fóra
s the latency, because the > timeWindowAll has to wait for the next timeWindow before it can close > the previous one. So if the first timeWindow is 10s, it takes 20s until > you have a result, although it can't change after 10s. You know what I mean? > > Cheers, > > Konst

Re: Cancel Streaming Job

2015-11-23 Thread Gyula Fóra
Hi! This issue has been fixed very recently and the fix will go into the upcoming bugfix release (0.10.1). Should be out in the next few days :) Cheers Gyula On Tue, Nov 24, 2015 at 4:49 AM Welly Tambunan wrote: > Hi All, > > Finally I've found the solution for killing the job manager. > > > h

Re: Custom TimestampExtractor and FlinkKafkaConsumer082

2015-11-30 Thread Gyula Fóra
nterval setting. > >>>>>>>> > >>>>>>>> But that should not be the cause of the problem that you > currently have. Would you maybe be willing to send me some (mock) example > data and the code so that I can reproduce the problem and h

Re: NPE with Flink Streaming from Kafka

2015-12-01 Thread Gyula Fóra
Hi, I think Robert meant to write setting the connector dependency to 1.0-SNAPSHOT. Cheers, Gyula Robert Metzger wrote (on Tue, 1 Dec 2015, 17:10): > Hi Mihail, > > the issue is actually a bug in Kafka. We have a JIRA in Flink for this as > well: https://issues.apache.org/jira/browse

Re: NPE with Flink Streaming from Kafka

2015-12-02 Thread Gyula Fóra
2015-12-01 17:42 GMT+01:00 Maximilian Michels : >> > Thanks! I've linked the issue in JIRA. >> > >> > On Tue, Dec 1, 2015 at 5:39 PM, Robert Metzger >> wrote: >> > > I think its this one https://issues.apache.org/jira/browse/KAFKA-824 >> >

Re: release of task slot

2016-02-04 Thread Gyula Fóra
Hey, I am actually facing a similar issue lately, where the JobManager releases the task slots as it cannot contact the TaskManager. Meanwhile the TaskManager is also trying to connect to the JobManager and fails multiple times. This happens on multiple TaskManagers, seemingly at random. So the TM

Re: release of task slot

2016-02-04 Thread Gyula Fóra
Yes exactly, it says it is quarantined. Gyula On Thu, Feb 4, 2016 at 4:09 PM Stephan Ewen wrote: > @Gyula Do you see log messages about quarantined actor systems? > > There may be an issue with Akka Death watches that once the connection is > lost, it cannot be re-established unless the

Re: release of task slot

2016-02-05 Thread Gyula Fóra
>> >> Would be good to also have Till's input on this... >> >> What do you think? >> >> Stephan >> >> >> >> On Thu, Feb 4, 2016 at 5:11 PM, Gyula Fóra wrote: >> >>> Yes exactly, it says it is quarantined. >>>

Re: streaming using DeserializationSchema

2016-02-11 Thread Gyula Fóra
Hey, A very simple thing you could do is to set up a simple Kafka producer in a Java program that will feed the data into a topic. This also has the additional benefit that you are actually testing against Kafka. Cheers, Gyula Martin Neumann wrote (on Fri, 12 Feb 2016, 0:20): > Hej,
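As a sketch of that suggestion, a tiny standalone feeder using the standard Kafka producer API (broker address, topic name, and record contents are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TestDataFeeder {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // send a handful of test records into the topic the Flink job reads from
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 100; i++) {
                producer.send(new ProducerRecord<>("test-topic", "key-" + i, "value-" + i));
            }
        }
    }
}
```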

Re: Newbie question

2016-02-14 Thread Gyula Fóra
Hi Renato, First of all, to do anything together with the two streams you probably want to union them. This means that you need to have a common type. If this is the case you are lucky and you don't need anything else. Otherwise I suggest using the Either type provided by Flink as a simple wrapper.
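A minimal sketch of the Either-wrapping idea, assuming two streams of String and Integer (the element types are placeholders for whatever the two streams carry):

```java
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.types.Either;

public class UnionWithEither {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> names = env.fromElements("a", "b");
        DataStream<Integer> counts = env.fromElements(1, 2);

        TypeInformation<Either<String, Integer>> type =
                TypeInformation.of(new TypeHint<Either<String, Integer>>() {});

        // wrap each stream into the common Either type, then union them
        DataStream<Either<String, Integer>> unioned =
                names.map(v -> Either.<String, Integer>Left(v)).returns(type)
                     .union(counts.map(v -> Either.<String, Integer>Right(v)).returns(type));

        unioned.print();
        env.execute("union with Either");
    }
}
```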

Kafka issue

2016-02-26 Thread Gyula Fóra
Hey, For one of our jobs we ran into this issue. It's probably some dependency issue but we can't figure it out as a very similar setup works without issues for a different program. java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object; at kafka.consumer.FetchR

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
> Till > > On Fri, Feb 26, 2016 at 10:09 AM, Gyula Fóra wrote: > >> Hey, >> >> For one of our jobs we ran into this issue. It's probably some dependency >> issue but we cant figure it out as a very similar setup works without >> issues for a different

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
re incorrect > > On Fri, Feb 26, 2016 at 10:45 AM, Gyula Fóra wrote: > >> I am not sure what is happening. I tried running against a Flink cluster >> that is definitely running the correct Scala version (2.10) and I still got >> the error. So it might be something with the pom

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
That actually seemed to be the issue; now that I compiled my own version it doesn't have these wrong jars in the dependency tree... Gyula Fóra wrote (on Fri, 26 Feb 2016, 11:01): > I was using the snapshot repo in this case, let me try building my own > version... > > M

Re: Kafka issue

2016-02-26 Thread Gyula Fóra
Thanks Robert, so apparently the snapshot version was screwed up somehow and included the 2.11 dependencies. Now it works. Cheers, Gyula Gyula Fóra wrote (on Fri, 26 Feb 2016, 11:09): > That actually seemed to be the issue; now that I compiled my own version > it doesn't have

Re: Counting tuples within a window in Flink Stream

2016-02-26 Thread Gyula Fóra
Hey, I am wondering if the following code will produce identical results but more efficiently (in parallel): input.keyBy(assignRandomKey).window(Time.seconds(10)).reduce(count).timeWindowAll(Time.seconds(10)).reduce(count) Effectively just assigning random keys to do the pre-aggregation and then do a windo
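Filled out, the random-key pre-aggregation could look like this sketch (the fan-out of 16 partial keys and the Long count type are assumptions; note that random keys trade deterministic partitioning for load balance):

```java
import java.util.concurrent.ThreadLocalRandom;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class RandomKeyPreAggregate {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Long> ones = env.generateSequence(0, 1_000_000).map(x -> 1L);

        ones.keyBy(x -> ThreadLocalRandom.current().nextInt(16)) // spread counting over 16 partial keys
            .timeWindow(Time.seconds(10))
            .reduce((a, b) -> a + b)                             // parallel pre-aggregation
            .timeWindowAll(Time.seconds(10))
            .reduce((a, b) -> a + b)                             // final global count
            .print();

        env.execute("random-key pre-aggregation");
    }
}
```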

Re: Kafka issue

2016-03-03 Thread Gyula Fóra
Hey, Do we have any idea why this is happening in the snapshot repo? We have run into the same issue again... Cheers, Gyula Gyula Fóra wrote (on Fri, 26 Feb 2016, 11:17): > Thanks Robert, so apparently the snapshot version was screwed up somehow > and included the 2.11 depend

Re: Kafka issue

2016-03-03 Thread Gyula Fóra
2.11 is broken > for the Kafka connector. The reason is that a property value is not > properly resolved and thus pulls in the 2.10 Kafka dependencies. Max > already opened a PR to fix this problem. I hope this will also solve your > problem. > > Cheers, > Till > > On

Re: Clear irrelevant state values

2016-04-25 Thread Gyula Fóra
Hi, (a) I think your understanding is correct, one consideration might be that if you are always sending the state to the sink, it might make sense to build it there directly using a RichSinkFunction. (b) There is no built-in support for this at the moment. What you can do yourself is to generate
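A minimal sketch of suggestion (a): a RichSinkFunction that builds the state itself. It assumes a later Flink 1.x API, that the sink sits downstream of a keyBy, and Tuple2<String, Long> elements (all assumptions for illustration):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class StatefulSink extends RichSinkFunction<Tuple2<String, Long>> {

    private transient ValueState<Long> total; // keyed state, one value per key

    @Override
    public void open(Configuration parameters) {
        total = getRuntimeContext().getState(
                new ValueStateDescriptor<>("total", Long.class));
    }

    @Override
    public void invoke(Tuple2<String, Long> value) throws Exception {
        Long current = total.value();
        total.update((current == null ? 0L : current) + value.f1); // accumulate per key
        // ... write the updated total to the external system here
    }
}
```

Usage would be something like stream.keyBy(0).addSink(new StatefulSink()); keyed state in a sink only works if it is applied to a keyed stream.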

Re: Clear irrelevant state values

2016-04-25 Thread Gyula Fóra
arkers? > 2. My understanding is this implementation of custom state maintenance > does not impact scalability. Is that right? > > Thanks, > Sowmya > > On Mon, Apr 25, 2016 at 3:06 PM, Gyula Fóra wrote: > >> Hi, >> >> (a) I think your understanding is correct, one

Re: Regarding Broadcast of datasets in streaming context

2016-04-28 Thread Gyula Fóra
Hi Biplob, I have implemented a similar algorithm as Aljoscha mentioned. First things to clarify are the following: There is currently no abstraction for keeping objects (in your case centroids) in a centralized way that can be updated/read by all operators. This would probably be very costly and

Re: Regarding Broadcast of datasets in streaming context

2016-05-02 Thread Gyula Fóra
example from this > > following link: > > > https://github.com/stratosphere/stratosphere/blob/master/stratosphere-examples/stratosphere-java-examples/src/main/java/eu/stratosphere/example/java/clustering/KMeans.java > > > > and I understand how this is working .. bu

Re: Regarding Broadcast of datasets in streaming context

2016-05-04 Thread Gyula Fóra
e set? Thus, updating a centroid and collecting it back for the > next point in the iteration. > > I may not be getting the concept properly here, so an example snippet would > help in a long run. > > Thanks & Regards > Biplob > Gyula Fóra wrote > > Hey, > > >

Re: Regarding Broadcast of datasets in streaming context

2016-05-15 Thread Gyula Fóra
Hi, If you haven't done so already, please read the respective part of the streaming docs: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/index.html#iterations Iterations just allow you to define cyclic flows; there is nothing magic about them. If your original input stream is
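To make the "cyclic flow" point concrete, a self-contained toy iteration (the doubling/threshold logic is purely illustrative, not the KMeans use case discussed in this thread):

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.IterativeStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SimpleIteration {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<Long> input = env.generateSequence(1, 10);

        IterativeStream<Long> iteration = input.iterate();
        DataStream<Long> doubled = iteration.map(x -> x * 2L);

        // values below the threshold are fed back into the loop
        iteration.closeWith(doubled.filter(x -> x < 100));

        // values at or above the threshold leave the loop
        doubled.filter(x -> x >= 100).print();

        env.execute("simple streaming iteration");
    }
}
```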

Re: [DISCUSS] Allowed Lateness in Flink

2016-05-30 Thread Gyula Fóra
Thanks Aljoscha :) I added some comments that might seem relevant from the user's point of view. Gyula Aljoscha Krettek wrote (on Mon, 30 May 2016, 10:33): > Hi, > I created a new doc specifically about the interplay of lateness and > window state garbage collection: > https://docs.goo

Using double quotes for SQL identifiers

2021-01-26 Thread Gyula Fóra
Hi All! Is it possible in any way to configure the Table environments to allow double quotes (") to be used for identifiers instead of backticks (`). Thank you! Gyula

Re: Using double quotes for SQL identifiers

2021-01-26 Thread Gyula Fóra
a, > > > > AFAIK, except the sql-dialect, table API does not expose any parser > > related configuration to the user. > > But we still can change the config of quoting identifiers in parser with > > some code changing. > > You can reference this test cla

SHOW CREATE TABLE in Flink SQL

2020-03-02 Thread Gyula Fóra
Hi All! I am looking for the functionality to show how a table was created or show all the properties (connector, etc.). I could only find DESCRIBE at this point, which only shows the schema. Is there anything similar to "SHOW CREATE TABLE" or is this something that we should maybe add in the futu

Re: SHOW CREATE TABLE in Flink SQL

2020-03-02 Thread Gyula Fóra
table" in the future but is > an orthogonal requirement. > > Best, > Jark > > > [1]: https://issues.apache.org/jira/browse/FLINK-16384 > > On Mon, 2 Mar 2020 at 22:09, Jeff Zhang wrote: > >> +1 for this, maybe we can add 'describe extended table'

CREATE TABLE with Schema derived from format

2020-03-04 Thread Gyula Fóra
Hi All! I am wondering if it would be possible to change the CREATE TABLE statement so that it would also work without specifying any columns. The format generally defines the available columns so maybe we could simply use them as is if we want. This would be very helpful when exploring differen

Re: CREATE TABLE with Schema derived from format

2020-03-04 Thread Gyula Fóra
INK-16420 to track this effort. But not sure we have enough > time to support it before 1.11. > > Best, > Jark > > [1]: https://issues.apache.org/jira/browse/FLINK-16420 > > > On Wed, 4 Mar 2020 at 18:21, Gyula Fóra wrote: > >> Hi All! >> >> I am wonder

Re: CREATE TABLE with Schema derived from format

2020-03-04 Thread Gyula Fóra
it, and the CREATE TABLE statement can leave out schema part, e.g. > > CREATE TABLE user_behavior WITH ("connector"="kafka", > "topic"="user_behavior", "schema.registery.url"="localhost:8081") > > Which way are you looking for?

How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
Hi All! I am trying to understand if there is any way to override Flink configuration parameters when starting the SQL Client. It seems that the only way to pass any parameters is through the environment YAML. There I found 2 possible routes: configuration: this doesn't work as it only sets Tab

Re: How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
; SQL client yaml file can only override some of the Flink configurations. > > Configuration entries indeed can only set Table specific configs, while > deployment entries are used to set the result fetching address and port. > There is currently no way to change the execution target fro

Re: How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
ver, SQL CLI only allows to set Table >> specific configs. >> I will think it as a bug/improvement of SQL CLI which should be fixed in >> 1.10.1. >> >> Best, >> Jark >> >> On Thu, 5 Mar 2020 at 18:12, Gyula Fóra wrote: >> >>> Thanks C

Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
Hi All! Excuse my stupid question, I am pretty new to the Table/SQL API and I am trying to play around with it by implementing and running a few use cases. I have a simple window join + aggregation, grouped on some id, that I want to write to Kafka, but I am hitting the following error: "AppendStream

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
g what could I do to just simply pump the result updates to Kafka here. Gyula On Thu, Mar 5, 2020 at 2:37 PM Khachatryan Roman < khachatryan.ro...@gmail.com> wrote: > Hi Gyula, > > Could you provide the code of your Flink program, the error with > stacktrace and the Flink version?
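One generic way to "pump the result updates to Kafka" (a sketch, not necessarily this thread's resolution): convert the updating Table into a retract stream and handle the change flags yourself. This assumes the Flink 1.10-era StreamTableEnvironment; tableEnv, the query, and kafkaSink are placeholders:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.table.api.Table;
import org.apache.flink.types.Row;

Table result = tableEnv.sqlQuery("SELECT ... FROM QueryResult");

// f0 is true for an insert/update and false for a retraction
DataStream<Tuple2<Boolean, Row>> retractStream =
        tableEnv.toRetractStream(result, Row.class);

retractStream
        .filter(change -> change.f0)          // keep inserts/updates, drop retractions
        .map(change -> change.f1.toString())  // serialize however your Kafka schema expects
        .addSink(kafkaSink);                  // kafkaSink: an assumed FlinkKafkaProducer<String>
```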

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
://ci.apache.org/projects/flink/flink-docs-release-1.10/api/java/org/apache/flink/streaming/connectors/kafka/KafkaTableSink.html > > On Thu, Mar 5, 2020 at 2:50 PM Gyula Fóra wrote: > >> Hi Roman, >> >> This is the core logic: >> >> CREATE TABLE QueryResult (

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
window is calculated and fired. >> But with some other arbitrary aggregations, there is not enough >> information for Flink to determine whether >> the data is complete or not, so the framework will keep calculating >> results when receiving new records and >> retract ea

Re: Writing retract streams to Kafka

2020-03-05 Thread Gyula Fóra
e > groups like (itemId, eventtime, queryId) have complete data or not. > As a comparison, if you change the grouping key to a window which based > only on q.event_time, then the query would emit insert only results. > > Best, > Kurt > > > On Thu, Mar 5, 2020 at 10:29 PM

Re: How to override flink-conf parameters for SQL Client

2020-03-05 Thread Gyula Fóra
n, like visualization. >> >> If you are interested, you could try the master branch of Zeppelin + this >> improvement PR >> >> https://github.com/apache/zeppelin >> https://github.com/apache/zeppelin/pull/3676 >> https://github.com/apache/zeppelin/blob/maste

Re: How to override flink-conf parameters for SQL Client

2020-03-06 Thread Gyula Fóra
mit another >> job to this cluster, then all the configurations >> relates to process parameters like TM memory, slot number etc are not be >> able to modify. >> >> Best, >> Kurt >> >> >> On Thu, Mar 5, 2020 at 11:08 PM Gyula Fóra wrote: >> >>

Re: Writing retract streams to Kafka

2020-03-06 Thread Gyula Fóra
Thanks Kurt, I came to the same conclusions after trying what Jark provided. I can get similar behaviour if I reduce the grouping window to 1 sec but still keep the join window large. Gyula On Fri, Mar 6, 2020 at 3:09 PM Kurt Young wrote: > @Gyula Fóra I think your query is right, we sho

Kerberos authentication for SQL CLI

2020-03-24 Thread Gyula Fóra
Hi! Does the SQL CLI support Kerberos Authentication? I am struggling to find any use of the SecurityContext in the SQL CLI logic but maybe I am looking in the wrong place. Thank you! Gyula

Re: Kerberos authentication for SQL CLI

2020-03-24 Thread Gyula Fóra
r. Therefore if you > kerberize the cluster the queries will use that configuration. > > On a different note. Out of curiosity. What would you expect the SQL CLI > to use the Kerberos authentication for? > > Best, > > Dawid > > On 24/03/2020 11:11, Gyula Fóra wrote

Inserting nullable data into NOT NULL columns

2020-04-09 Thread Gyula Fóra
Hi All! We ran into a problem while trying to insert data read from Kafka into a table sink where some of the columns are not nullable. The problem is that from Kafka we can only read nullable columns in JSON format, otherwise you get the following error: org.apache.flink.table.api.ValidationExce
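As a generic SQL-level sketch of one possible workaround (table and column names are made up, and as the reply below notes, Calcite's nullability handling here does not always behave as one would expect): supply an explicit default so the planner can derive a NOT NULL type:

```java
// Assumes a Flink 1.10-era TableEnvironment `tableEnv` with the source and
// sink tables already registered; names are hypothetical.
tableEnv.sqlUpdate(
    "INSERT INTO not_null_sink " +
    "SELECT id, COALESCE(name, 'unknown') AS name FROM kafka_source");
```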

Re: Inserting nullable data into NOT NULL columns

2020-04-10 Thread Gyula Fóra
`NULLIF()` should do the trick in the query but unfortunately > the current Calcite behavior is not what one would expect. > > Thanks, > Timo > > > On 09.04.20 15:53, Gyula Fóra wrote: > > Hi All! > > > > We ran into a problem while trying to insert data rea

Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
Hi All! We hit the following problem with SQL and are trying to understand if there is a valid workaround. We have 2 tables: *Kafka*: timestamp (ROWTIME), item, quantity; *Hive*: item, price. So we basically have incoming (ts, id, quantity) and we want to join it with the Hive table to get the total pr

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
lease-1.10/dev/table/streaming/joins.html#join-with-a-temporal-table > > Best, > Godfrey > > Gyula Fóra wrote (on Mon, 20 Apr 2020, 4:46 PM): > >> Hi All! >> >> We hit the following problem with SQL and are trying to understand if there >> is a valid workaround. >&g

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
ng to join them, Flink will assume both tables will be changing with > time. > > Best, > Kurt > > > On Mon, Apr 20, 2020 at 9:48 PM Gyula Fóra wrote: > >> Hi! >> >> The problem here is that I don't have a temporal table. >> >> I have a regula

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
> and EnvironmentSettings's batchMode or streamingMode (newer versions). > > But we should admit that Flink hasn't finish the unification work. Your > case will also be considered in the > future when we want to further unify and simplify these concepts and > us

Re: Joining table with row attribute against an enrichment table

2020-04-20 Thread Gyula Fóra
by. And while checking the time attributes we would need to know > which table is bounded and what kind of changes are coming into the > streaming table. > > There is still a lot of work in the future to make the concepts smoother. > > Regards, > Timo > > > [0] https://is

Cannot map nested Tuple fields to table columns

2020-04-27 Thread Gyula Fóra
Hi All! I was trying to flatten a nested tuple into named columns with the fromDataStream method and I hit some problems with mapping tuple fields to column names. It seems like the `f0 as ColumnName` kind of expressions are not parsed correctly. It is very easy to reproduce: tableEnv.fromDataSt
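The reproduction, spelled out as a sketch (Flink 1.10-era string-expression API; env and tableEnv are assumed to exist, and the alias names are made up):

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.table.api.Table;

DataStream<Tuple2<String, Long>> stream =
        env.fromElements(Tuple2.of("a", 1L));

// plain field references (f0, f1) work; the aliases are what reportedly fail
Table renamed = tableEnv.fromDataStream(stream, "f0 as name, f1 as cnt");
```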

Re: Cannot map nested Tuple fields to table columns

2020-04-27 Thread Gyula Fóra
Hi Leonard, The tuple fields can also be referenced by their POJO names (f0, f1) and they can be reordered similarly to POJO fields; however, you cannot alias them. (The link I sent shows how it is supposed to work, but it throws an exception when I try it.) Also what I am trying

Re: Cannot map nested Tuple fields to table columns

2020-04-27 Thread Gyula Fóra
olvedCallExpression) > f).getFunctionDefinition() == BuiltInFunctionDefinitions.AS && > f.getChildren().get(0) instanceof > UnresolvedReferenceExpression) { > return false; > } > > if (f instanceof UnresolvedReferenceExpression

Converting String/boxed-primitive Array columns back to DataStream

2020-04-28 Thread Gyula Fóra
Hi All! I have a Table with columns of ARRAY<STRING> and ARRAY<INT>, is there any way to convert it back to the respective Java arrays (String[] and Integer[])? It only seems to work for primitive types (non-null), date, time and decimal. For String for instance I get the following error: Query schema: [f0: AR

Re: Converting String/boxed-primitive Array columns back to DataStream

2020-04-28 Thread Gyula Fóra
mapping between old TypeInformation and new DataType > system. A back and forth conversion should work between those types. > > Regards, > Timo > > On 28.04.20 15:36, Gyula Fóra wrote: > > Hi All! > > > > I have a Table with columns of ARRAY and ARRAY, is there >

Re: Converting String/boxed-primitive Array columns back to DataStream

2020-04-28 Thread Gyula Fóra
you? The other methods take > TypeInformation and might cause this problem. It is definitely a bug. > > Feel free to open an issue under: > https://issues.apache.org/jira/browse/FLINK-12251 > > Regards, > Timo > > On 28.04.20 18:44, Gyula Fóra wrote: > > Hi Timo, >

Using logicalType in the Avro table format

2020-04-29 Thread Gyula Fóra
Hi All! We are trying to work with Avro-serialized data from Kafka using the Table API and the TIMESTAMP column type. According to the docs, we can use the long type with logicalType: timestamp-mi

Re: Using logicalType in the Avro table format

2020-04-30 Thread Gyula Fóra
ation about the bridging class > (java.sql.Timestamp in this case) is lost in the stack. Because this > information is lost/not respected the planner produces LocalDateTime > instead of a proper java.sql.Timestamp time. The AvroRowSerializationSchema > expects java.sql.Timestamp for a colum

Re: Using logicalType in the Avro table format

2020-04-30 Thread Gyula Fóra
ma being correctly > translated to a specific record that uses the new TimeConversions [1]. > > [1] > https://github.com/apache/avro/blob/master/lang/java/avro/src/main/java/org/apache/avro/data/TimeConversions.java > > On Thu, Apr 30, 2020 at 10:41 AM Gyula Fóra wrote: > >&

Re: Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-06-18 Thread Gyula Fóra
r@ and user-zh@ to hear more feedback from users. > > Best, > Jark > > On Thu, 18 Jun 2020 at 21:25, Gyula Fóra wrote: > >> Hi All! >> >> I would like to revive an old ticket >> <https://issues.apache.org/jira/browse/FLINK-9849> and discussion around >

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-06-22 Thread Gyula Fóra
If we were to go the Bahir route, I don't see the point in migrating the 1.4.x version there since that's already available in Flink. To me that is almost the same as dropping explicit support for 1.4 and telling users to use older connector versions if they wish to keep using it. If we want to ke

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Gyula Fóra
stuff into Flink, and legacy, >> experimental or unstable connectors into Bahir. >> >> >> Who can take care of this effort? (Decide which Hbase 2 PR to take, >> review and contribution to Bahir) >> >> >> [1] https://issues.apache.org/jira/browse/FLINK-18

Custom log appender for YARN

2019-07-31 Thread Gyula Fóra
Hi All! We are trying to configure a custom Kafka log appender for our YARN application and we hit the following problem. We included the log appender dependency in the fat jar of the application because in YARN that should be part of the system class path. However, when the YARN cluster entrypoin

Re: Custom log appender for YARN

2019-07-31 Thread Gyula Fóra
a/lang/ClassLoader.html > 2. > https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_classloading.html > > > Thanks, > Biao /'bɪ.aʊ/ > > > > On Wed, Jul 31, 2019 at 9:21 PM Gyula Fóra wrote: > >> Hi All! >> >> We are tryi

Re: BucketingSink - Could not invoke truncate while recovering from state

2019-08-27 Thread Gyula Fóra
Hi all! I am gonna try to resurrect this thread as I think I have hit the same issue with the StreamingFileSink: https://issues.apache.org/jira/browse/FLINK-13874 I don't have a good answer but it seems that we try to truncate before we get the lease (even though there is logic both in BucketingS

Re: BucketingSink - Could not invoke truncate while recovering from state

2019-08-27 Thread Gyula Fóra
e from the same task and the lease is not > "reentrant"? > > Cheers, > Kostas > > On Tue, Aug 27, 2019 at 4:53 PM Gyula Fóra wrote: > > > > Hi all! > > > > I am gonna try to resurrect this thread as I think I have hit the same > issue with the Streaming

Comparing Storm and Flink resource requirements

2019-10-21 Thread Gyula Fóra
Hi All! I would like to ask the community for any experience regarding migrating production applications from Storm to Flink. Specifically, I am interested in your experience related to the resource requirements for the same pipeline as implemented in Flink vs in Storm. The design of the applicati

Re: Comparing Storm and Flink resource requirements

2019-10-22 Thread Gyula Fóra
es like state. > Our state was small at the time, and the main business was real-time ETL. > If it is a different type of business, the problem may be more complicated > and may require a specific analysis of the specific problem. > > Best, > Vino > > Gyula Fóra wrote on Mon, 21 Oct 2019

Re: [DISCUSS] Semantic and implementation of per-job mode

2019-10-31 Thread Gyula Fóra
Hi all, Regarding the compilation part: I think there are upsides and downsides to building the Flink job (running the main method) on the client side; however, since this is the current way of doing it, we should have a very powerful reason to change the default behaviour. While there is a possible wor

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Gyula Fóra
You might have to introduce some dummy keys for a more robust solution that integrates with the fault-tolerance mechanism. Gyula On Tue, Nov 5, 2019 at 9:57 AM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Thanks Piotr, > > the thing is that I am on Stream data and not on keyed strea

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-11-27 Thread Gyula Fóra
You are right Aaron. I would say this is by design: Flink doesn't require you to initialize state in the open method, so it has no safe way to delete the non-referenced ones. What you can do is restore the state, clear it on all operators and not reference it again. I know this feel

Re: ArrayIndexOutOfBoundException on checkpoint creation

2019-11-29 Thread Gyula Fóra
Hi Theo! I have not seen this error before however I have encountered many strange things when using Kryo for serialization. From the stack trace it seems that this might indeed be a Kryo related issue. I am not sure what it is but what I would try is to change the state serializers to a non Kryo

Re: [DISCUSSION] Using `DataStreamUtils.reinterpretAsKeyedStream` on a pre-partitioned Kafka topic

2019-12-01 Thread Gyula Fóra
Hi! As far as I know, even if you prepartition the data exactly the same way in Kafka using the key groups, you have no guarantee that the Kafka consumer source would pick up the right partitions. Maybe if you have exactly as many Kafka partitions as key groups/max parallelism, partitioned corre
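For reference, a sketch of the API under discussion (experimental in Flink): reinterpreting an already-partitioned stream as keyed without a shuffle. The caveat above is that the external partitioning must match Flink's own key-group assignment; the element type is a placeholder:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamUtils;
import org.apache.flink.streaming.api.datastream.KeyedStream;

DataStream<Tuple2<String, Long>> prePartitioned = ...; // stands in for the pre-partitioned Kafka source

// no shuffle is inserted; correctness relies on the existing partitioning
KeyedStream<Tuple2<String, Long>, String> keyed =
        DataStreamUtils.reinterpretAsKeyedStream(prePartitioned, t -> t.f0);
```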

Custom metrics reporter classloading problem

2018-07-11 Thread Gyula Fóra
Hi all, I have run into the following problem and I want to double-check whether this is intended behaviour. I have a custom metrics reporter that pushes things to Kafka (so it creates a KafkaProducer in the open method, etc.) for my streaming job. Naturally as my Flink job consumes from Kafka
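For context, a sketch of the kind of reporter described (class name, topic, and config keys are made up). Reporters are instantiated from the system classpath when the JM/TM starts, which is exactly why Kafka classes that live only in the job's fat jar are not visible to it:

```java
import java.util.Properties;
import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KafkaMetricReporter implements MetricReporter {

    private KafkaProducer<String, String> producer;

    @Override
    public void open(MetricConfig config) {
        // runs at JM/TM startup, so the Kafka classes must be on the system
        // classpath (e.g. in /lib), not only in the job's fat jar
        Properties props = new Properties();
        props.put("bootstrap.servers", config.getString("servers", "localhost:9092"));
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producer = new KafkaProducer<>(props);
    }

    @Override
    public void close() {
        producer.close();
    }

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        producer.send(new ProducerRecord<>("flink-metrics", metricName, "added"));
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        producer.send(new ProducerRecord<>("flink-metrics", metricName, "removed"));
    }
}
```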

Re: Custom metrics reporter classloading problem

2018-07-11 Thread Gyula Fóra
JM/TM starts, i.e. before any user-code is > even accessible. > > My recommendation would be to either put the kafka dependencies in the > /lib folder or try to relocate the kafka code in the reporter. > > On 11.07.2018 14:59, Gyula Fóra wrote: > > Hi all, > > > > I

External checkpoint metadata in Flink 1.5.x

2018-07-12 Thread Gyula Fóra
Hi, It seems that the behaviour of storing the checkpoint metadata files for externalized checkpoints changed from 1.4 to 1.5, and the docs seem to incorrectly say: "state.checkpoints.dir: The target directory for meta data of externalized checkpoints

Re: External checkpoint metadata in Flink 1.5.x

2018-07-12 Thread Gyula Fóra
issue which aims at implementing a solution: > https://issues.apache.org/jira/browse/FLINK-9114. > > I quickly talked to Stephan, it seems to be that the meta info about > externalized checkpoints is also written to the HA storage directory, maybe > that's helpful for you. > >

Re: External checkpoint metadata in Flink 1.5.x

2018-07-13 Thread Gyula Fóra
checkpoints are in a central location? > > Best, > Aljoscha > > > On 12. Jul 2018, at 17:55, Gyula Fóra wrote: > > Hi! > > Well it depends on how we look at it FLINK-5627 > <https://issues.apache.org/jira/browse/FLINK-5627> is not necessarily > the c

Events can overtake watermarks

2018-07-22 Thread Gyula Fóra
Hi, In 1.5.1 I have noticed some strange behaviour that happens quite frequently and I just want to double-check with you that this is intended. If I have a non-parallel source that takes the following actions: emit: event1, emit: watermark1, emit: event2, it can happen that a downstream operators

Re: Events can overtake watermarks

2018-07-23 Thread Gyula Fóra
ike a „wrong“ behaviour, only > watermarks overtaking events would be bad. Do you think this only started > from Flink 1.5? To me this does not sound like a problem, but not sure if > it is intended. Looping in Aljoscha, just in case. > > Best, > Stefan > > Am 22.07.2018 u

Re: Events can overtake watermarks

2018-07-23 Thread Gyula Fóra
even though the other one should also have an element. Gyula Gyula Fóra wrote (on Mon, 23 Jul 2018, 10:44): > Hi guys, > > Let me clarify. There is a single source with parallelism 1 and a single > downstream operator with parallelism > 1. > So the watermark is strictly

Re: Events can overtake watermarks

2018-07-23 Thread Gyula Fóra
Yea, now that I think about it, that's probably the case. Sorry to bother :) Gyula Gyula Fóra wrote (on Mon, 23 Jul 2018, 11:04): > Hm, I wonder if it could be because the downstream operator is a 2-input > operator and I do some filtering on the source elements to direct to one of

Re: Some questions about the StreamingFileSink

2018-08-22 Thread Gyula Fóra
Hi Kostas, Sorry for jumping in on this discussion :) What you suggest for finite sources and waiting for checkpoints is pretty ugly in many cases. Especially if you would otherwise read from a finite source (a file for instance) and want to end the job asap. Would it make sense to not discard a

Re: 1.5 Checkpoint metadata location

2018-09-25 Thread Gyula Fóra
Yes, the only workaround I found in the end was to restore the previous behavior where metadata files are written separately. But for this you need a custom Flink build with the changes to the checkpointing logic. Gyula On Tue, 25 Sep 2018 at 16:45, Till Rohrmann wrote: > Hi Bryant, > > I thin

Re: Ship compiled code with broadcast stream ?

2018-10-09 Thread Gyula Fóra
Hi, This is certainly possible. What you can do is use a BroadcastProcessFunction where you receive the rule code on the broadcast side. You probably cannot send newly compiled objects this way but what you can do is either send a reference to some compiled jars and load them with the URLClassloa

Re: Ship compiled code with broadcast stream ?

2018-10-09 Thread Gyula Fóra
You should not try sending the compiled code anywhere but you can use it from within the processor. You can do the same thing with the jar, you compile your jar, store it on HDFS. Send the jar path to the processor which can download the jar and instantiate the rule. No need to resubmit the job.
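A hedged sketch of that jar-shipping pattern (the Rule interface, the message format, and all names are made up for illustration; this is not code from the thread):

```java
import java.net.URL;
import java.net.URLClassLoader;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class DynamicRuleFunction extends BroadcastProcessFunction<String, String, String> {

    /** Hypothetical rule contract that the dynamically loaded classes implement. */
    public interface Rule { String apply(String event); }

    private transient Rule rule; // latest rule, rebuilt from broadcast updates

    @Override
    public void processBroadcastElement(String update, Context ctx,
                                        Collector<String> out) throws Exception {
        // assumed message format: "<jar URL>;<rule class name>"
        String[] parts = update.split(";");
        URLClassLoader loader = new URLClassLoader(
                new URL[] {new URL(parts[0])}, getClass().getClassLoader());
        rule = (Rule) loader.loadClass(parts[1]).getDeclaredConstructor().newInstance();
    }

    @Override
    public void processElement(String event, ReadOnlyContext ctx,
                               Collector<String> out) throws Exception {
        if (rule != null) {
            out.collect(rule.apply(event)); // apply the most recently loaded rule
        }
    }
}
```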

Re: Take RocksDB state dump

2018-10-17 Thread Gyula Fóra
Hi, If you don't mind trying out stuff a little, I have some nice tooling for exactly this: https://github.com/king/bravo Let me know if it works :) Gyula Harshvardhan Agrawal wrote (on Wed, 17 Oct 2018, 21:50): > Hello, > > We are currently using a RocksDBStateBackend for our Flin

State processing and testing utilities (Bravo)

2018-11-07 Thread Gyula Fóra
Hey all! I just wanted to give you a quick update on the bravo project. Bravo contains a bunch of useful utilities for processing the checkpoint/savepoint state of a streaming job as Flink Datasets (batch). The end goal of the project is to be contributed to Flink once we are happy with it but fo

Re: [ANNOUNCE] Dropping "CheckpointConfig.setPreferCheckpointForRecovery()"

2021-08-25 Thread Gyula Fóra
Hi Stephan, I do not know if anyone is still relying on this but I think it makes sense to drop this feature. So +1 from me. I think it served a valid purpose originally but if we have a good improvement in the pipeline using the savepoints directly that will solve the problem properly. I would c

Re: Flink native k8s integration vs. operator

2022-01-10 Thread Gyula Fóra
Hi All! This is a very interesting discussion. I think many users find it confusing what deployment mode to choose when considering a new production application on Kubernetes. With all the options of native, standalone and different operators this can get tricky :) I really like the idea that Th

Re: Flink native k8s integration vs. operator

2022-01-17 Thread Gyula Fóra
n and sorry for chiming in > >>> late. > >>> >> > > >>> >> > I agree with Thomas' and David's assessment of Flink's "Native > >>> >> Kubernetes > >>> >> > Integration",

Using port ranges to connect with the Flink Client

2018-12-04 Thread Gyula Fóra
Hi! We have been running Flink on YARN for quite some time and historically we specified port ranges so that the client can access the cluster: yarn.application-master.port: 100-200 Now we updated to Flink 1.7 and are trying to migrate away from the legacy execution mode, but we run into a problem that

Re: Using port ranges to connect with the Flink Client

2018-12-05 Thread Gyula Fóra
ports to the open ranges but not sure what I have to change. Gyula Gyula Fóra wrote (on Tue, 4 Dec 2018, 15:11): > Hi! > > We have been running Flink on YARN for quite some time and historically we > specified port ranges so that the client can access the cluster: > >

Re: Using port ranges to connect with the Flink Client

2018-12-05 Thread Gyula Fóra
Ah, it seems to be something with the custom Flink client build that we run... Still don't know why, but if I use the normal client once the job is started it works. Gyula Gyula Fóra wrote (on Wed, 5 Dec 2018, 9:50): > I get the following error when trying to savepoint a job
