Putting a record on kinesis (actually kinesalite) results in "The security token included in the request is invalid."

2020-12-10 Thread Avi Levi
Hi, any help here will be greatly appreciated; I am about to throw in the towel, very frustrating... I am trying to put a record on kinesalite with the following configuration: System.setProperty(com.amazonaws.SDKGlobalConfiguration.AWS_CBOR_DISABLE_SYSTEM_PROPERTY, "true") System.setProperty(SDK
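For reference, a minimal producer setup that is known to work against kinesalite, sketched here under assumptions: a local endpoint on port 4567 and a stream named test-stream are hypothetical, and kinesalite accepts any non-empty credentials. CBOR must be disabled before the AWS SDK is loaded, and the producer uses the KPL-style endpoint keys from the Flink Kinesis connector docs:

import java.util.Properties
import com.amazonaws.SDKGlobalConfiguration
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants

// kinesalite does not speak CBOR, so it has to be switched off up front
System.setProperty(SDKGlobalConfiguration.AWS_CBOR_DISABLE_SYSTEM_PROPERTY, "true")

val producerConfig = new Properties()
producerConfig.put(AWSConfigConstants.AWS_REGION, "us-east-1")            // any valid region works locally
producerConfig.put(AWSConfigConstants.AWS_ACCESS_KEY_ID, "fakeAccessKey") // dummy values for kinesalite
producerConfig.put(AWSConfigConstants.AWS_SECRET_ACCESS_KEY, "fakeSecretKey")
producerConfig.put("KinesisEndpoint", "localhost") // KPL-style endpoint override for the producer
producerConfig.put("KinesisPort", "4567")
producerConfig.put("VerifyCertificate", "false")   // kinesalite's TLS cert is self-signed

val producer = new FlinkKinesisProducer[String](new SimpleStringSchema(), producerConfig)
producer.setDefaultStream("test-stream")
producer.setDefaultPartition("0")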

Re: Putting a record on kinesis (actually kinesalite) results in "The security token included in the request is invalid."

2020-12-10 Thread Avi Levi
aven't worked with the > KinesisConsumer. Unfortunately, I cannot judge whether there's something > missing in your setup. But first of all: Could you confirm that the key > itself is valid? Did you try to use it in other cases? > > Best, > Matthias > > On Thu,

Re: Putting a record on kinesis (actually kinesalite) results in "The security token included in the request is invalid."

2020-12-11 Thread Avi Levi
(relying on the Flink docs [1] > here). Have you considered asking in the AWS community for help? > > Best, > Matthias > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kinesis.html#kinesis-producer > > On Thu, Dec 10, 2020 at 6:31 PM Avi Levi wr

Never-terminating test ...

2020-12-13 Thread Avi Levi
I have the following test. The problem is that it doesn't end, meaning it never reaches the assertion point. What am I doing wrong? "kinesis consumer" should "consume message from kinesis stream" in { import ExecutionContext.Implicits.global val sampleData = Seq("a", "b", "c") val env: S
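The usual cause: with an unbounded source such as the Kinesis consumer, env.execute() blocks forever, so the assertion after it is never reached. A minimal sketch of a bounded variant, with a hypothetical CollectSink gathering results into a static queue:

import java.util.concurrent.ConcurrentLinkedQueue
import org.apache.flink.streaming.api.functions.sink.SinkFunction
import org.apache.flink.streaming.api.scala._

// static queue so collected values survive serialization of the sink instance
object CollectSink { val values = new ConcurrentLinkedQueue[String]() }

class CollectSink extends SinkFunction[String] {
  override def invoke(value: String): Unit = CollectSink.values.add(value)
}

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.fromCollection(Seq("a", "b", "c")).addSink(new CollectSink)
env.execute("bounded-test") // returns once the bounded source is exhausted
assert(CollectSink.values.size == 3)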

Connecting to kinesis with mfa

2020-12-15 Thread Avi Levi
Hi guys, we are struggling to connect to kinesis when mfa is activated. I configured everything according to the documentation but am still getting an exception: val producerConfig = new Properties() producerConfig.put(AWSConfigConstants.AWS_REGION, awsRegion) producerConfig.put(AWSConfigConstants

Re: Connecting to kinesis with mfa

2020-12-16 Thread Avi Levi
f them? > > https://docs.aws.amazon.com/credref/latest/refdocs/setting-global-aws_session_token.html > > I'll also ping some AWS contributors active in Flink to take a look at > this. > > Best, > Robert > > On Tue, Dec 15, 2020 at 10:07 AM Avi Levi wrote: > >> Hi guys, >

Re: Connecting to kinesis with mfa

2020-12-16 Thread Avi Levi
roperties(); > producerConfig.setProperty(AWSConfigConstants.*AWS_REGION*, *REGION*); > producerConfig.setProperty(AWSConfigConstants.*AWS_CREDENTIALS_PROVIDER*, > "SYS_PROP"); > > > > I will add this to the Jira also. Let me know if you have any issues. > > > >
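A sketch of the SYS_PROP approach suggested above, assuming the AWS SDK v1 system-property names (aws.accessKeyId, aws.secretKey, aws.sessionToken) and temporary credentials obtained out of band, e.g. via `aws sts get-session-token`; all values below are placeholders:

import java.util.Properties
import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants

// temporary MFA session credentials, obtained out of band (placeholders)
System.setProperty("aws.accessKeyId", "<temporary-access-key>")
System.setProperty("aws.secretKey", "<temporary-secret-key>")
System.setProperty("aws.sessionToken", "<session-token>")

val producerConfig = new Properties()
producerConfig.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1") // hypothetical region
producerConfig.setProperty(AWSConfigConstants.AWS_CREDENTIALS_PROVIDER, "SYS_PROP")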

Just published connect-flink-with-kinesis-kinesalite-using-scala

2020-12-23 Thread Avi Levi
Hi, after stumbling a little with connecting to kinesis/kinesalite I just published connect-flink-with-kinesis-kinesalite-using-scala; hopefully it will assist someone. Would love to get your input.

Getting Exception in thread "main" java.util.concurrent.ExecutionException: scala.tools.reflect.ToolBoxError: reflective compilation has failed: cannot initialize the compiler

2020-12-28 Thread Avi Levi
I am trying to aggregate all records in a time window. This is my ProcessAllWindowFunction: case class SimpleAggregate(elms: List[String]) class AggregateLogs extends ProcessAllWindowFunction[String, SimpleAggregate, TimeWindow] { override def process(context: Context, elements: Iterable[Stri
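For context, a complete version of such a function might look like the sketch below; the "cannot initialize the compiler" ToolBoxError itself typically hints at a scala-reflect/scala-compiler version mismatch on the classpath rather than at this code:

import org.apache.flink.streaming.api.scala.function.ProcessAllWindowFunction
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

case class SimpleAggregate(elms: List[String])

class AggregateLogs extends ProcessAllWindowFunction[String, SimpleAggregate, TimeWindow] {
  override def process(context: Context,
                       elements: Iterable[String],
                       out: Collector[SimpleAggregate]): Unit =
    out.collect(SimpleAggregate(elements.toList)) // one aggregate per window firing
}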

reading file from s3

2021-03-04 Thread Avi Levi
Hi, I am pretty new. I keep struggling to read a file from s3 but am getting this weird exception: Caused by: java.lang.NumberFormatException: For input string: "64M" (if anyone can link me to a working github example, that would be awesome). What am I doing wrong? This is how my code looks li
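The "64M" NumberFormatException is commonly a sign of a conflicting, older Hadoop on the classpath that cannot parse human-readable size strings. With the bundled S3 filesystem jar in place, the read itself is one line; bucket and key below are hypothetical:

import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
// requires flink-s3-fs-hadoop (or -presto) on the cluster and credentials in flink-conf.yaml
val lines = env.readTextFile("s3://my-bucket/some/key.txt") // hypothetical path
lines.print()
env.execute("read-from-s3")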

Re: reading file from s3

2021-03-04 Thread Avi Levi
eStreamTask.java:241) On Thu, Mar 4, 2021 at 6:02 PM Chesnay Schepler wrote: > Can you show us the full exception stacktrace? Intuitively I would think > your cluster configuration contains an invalid value for some memory > configuration option. > > On 3/4/2021 4:45 PM, Avi Levi w

Re: reading file from s3

2021-03-05 Thread Avi Levi
Does anyone by any chance have a working example (of course without the credentials etc.) that can be shared on github? Simply reading/writing a file from/to s3. I keep struggling with this one and getting weird exceptions. Thanks On Thu, Mar 4, 2021 at 7:30 PM Avi Levi wrote: > Sure,

Re: reading file from s3

2021-03-07 Thread Avi Levi
r once you run the container. > If you execute it locally(not in a container) in a standalone cluster, > make sure this env is defined in system level. > > Tamir. > -- > *From:* Tamir Sagi > *Sent:* Saturday, March 6, 2021 7:33 PM > *To:* Avi Levi ;

Filtering lines in parquet

2021-03-10 Thread Avi Levi
Hi all, I am trying to filter lines from parquet files. The problem is that they have different schemas; however, the field that I am filtering on exists in all schemas. In Spark this is quite straightforward: *val filtered = rawsDF.filter(col("id") != "123")* I tried to do it in flink by ext
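Once the rows are available as Avro GenericRecords, the Spark-style filter translates into a plain DataStream filter that only touches the shared field. A sketch, assuming a DataStream[GenericRecord] named records has already been read from the files:

import org.apache.avro.generic.GenericRecord
import org.apache.flink.streaming.api.scala._

// "id" exists in every schema, so this predicate is schema-agnostic
def keep(r: GenericRecord): Boolean =
  Option(r.get("id")).map(_.toString).forall(_ != "123")

// val filtered = records.filter(keep _)   // records: DataStream[GenericRecord] (assumed)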

Re: Filtering lines in parquet

2021-03-11 Thread Avi Levi
you want to write these records with different > schema into the same parquet file. Maybe, you just want to extract the > common fields of A, B, C? Then you can also use Table API and just declare > the fields that are common. > > Or do you have sink A, B, C and actually 3 separate topologie

Re: Filtering lines in parquet

2021-03-12 Thread Avi Levi
thods/parquet.hadoop.metadata.FileMetaData/getSchema>(); > > Quite possibly that's what Spark is doing under hood. If you open a ticket > with a feature request, we will add it in the future. > > On Thu, Mar 11, 2021 at 6:26 PM Avi Levi wrote: > >> Hi Arvid, >&g

getting an exception

2019-08-05 Thread Avi Levi
Hi, I'm using Flink 1.8.1. Our code mostly uses Scala. When I try to submit my job (on my local machine) it crashes with the error below (BTW, in the IDE it runs perfectly). Any assistance would be appreciated. Thanks Avi 2019-08-05 12:58:03.783 [Flink-DispatcherRestEndpoint-thread-3] ERROR or

Re: getting an exception

2019-08-06 Thread Avi Levi
.1 ? It won't > deploy on a 1.8.0 server any more, if that's a concern for you. > > Gaël > > On Mon, Aug 5, 2019 at 4:37 PM Wong Victor > wrote: > >> Hi Avi: >> >> >> >> It seems you are submitting your job with an older Flink version (< 1

need some advice comparing sliding window to a single unit

2019-11-06 Thread Avi Levi
Hi, I want to get the average of the last x hours and compare it to the sum of the current hour. I thought of using a ProcessWindowFunction over 8 hours and doing the calculation, i.e. consuming 8 hours of data, grouping it by the hour and doing the math, but it seems very inefficient, especially considering t
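A sliding window with an incremental AggregateFunction avoids buffering eight hours of raw records: only one accumulator per in-flight window is kept. A sketch, with Event and sensorId hypothetical:

import org.apache.flink.api.common.functions.AggregateFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

// incremental count: add() folds each element into the accumulator as it arrives
class CountAgg[T] extends AggregateFunction[T, Long, Long] {
  override def createAccumulator(): Long = 0L
  override def add(in: T, acc: Long): Long = acc + 1
  override def getResult(acc: Long): Long = acc
  override def merge(a: Long, b: Long): Long = a + b
}

// hypothetical wiring: counts over the last 8 hours, emitted every hour
// events.keyBy(_.sensorId)
//   .window(SlidingEventTimeWindows.of(Time.hours(8), Time.hours(1)))
//   .aggregate(new CountAgg[Event])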

Idiomatic way to split pipeline

2019-11-25 Thread Avi Levi
Hi, I want to split the output of one of the operators into two pipelines. Since the *split* method is deprecated, what is the idiomatic way to do that without duplicating the operator? [image: Screen Shot 2019-11-25 at 10.05.38.png]

Re: Idiomatic way to split pipeline

2019-11-25 Thread Avi Levi
ure > to replace it.[1] > > [1]: > https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html > > Best, > Vino > > Avi Levi wrote on Mon, Nov 25, 2019 at 4:12 PM: > >>

Re: Idiomatic way to split pipeline

2019-11-25 Thread Avi Levi
ats-the-difference-between-side-outputs-and-split-in-the-data> > Avi Levi wrote on Mon, Nov 25, 2019 at 5:32 PM: > >> Thank you for your quick reply, I appreciate that. But this is not >> exactly "side output" per se; it is simple splitting. IIUC the side output >> i

Re: Idiomatic way to split pipeline

2019-12-01 Thread Avi Levi
am2 = stream.window(...).addSink() > > In Flink, you can compose arbitrary directed acyclic graphs, so consuming > the output of one operator on several downstream operators is completely > normal. > > Best, > > Arvid > > On Mon, Nov 25, 2019 at 10:50 AM Avi Levi wr
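In short: plain fan-out needs no special operator at all (just reference the upstream twice), while content-dependent routing uses side outputs. A sketch of the latter, with the predicate hypothetical:

import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

val errTag = OutputTag[String]("errors")

def splitStream(shared: DataStream[String]): (DataStream[String], DataStream[String]) = {
  val main = shared.process(new ProcessFunction[String, String] {
    override def processElement(v: String,
                                ctx: ProcessFunction[String, String]#Context,
                                out: Collector[String]): Unit =
      if (v.startsWith("ERR")) ctx.output(errTag, v) // routed to the side output
      else out.collect(v)                            // stays on the main output
  })
  (main, main.getSideOutput(errTag))
}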

Firing timers on ProcessWindowFunction

2019-12-02 Thread Avi Levi
Hi, is there a way to fire a timer in a ProcessWindowFunction? I would like to mutate the global state on a timely basis.

Re: Firing timers on ProcessWindowFunction

2019-12-02 Thread Avi Levi
59 AM vino yang wrote: > *This Message originated outside your organization.* > -- > Hi Avi, > > Firstly, let's clarify that the "timer" you said is the timer of the > window? Or a timer you want to register to trigger some action? > >

Re: Firing timers on ProcessWindowFunction

2019-12-02 Thread Avi Levi
I think the only way to do this is to add a keyed operator downstream that will hold the global state. Not ideal, but I don't see any other option. On Mon, Dec 2, 2019 at 1:43 PM Avi Levi wrote: > Hi Vino, > I have a global state that I need to mutate every X hours (e.g clean that
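A sketch of such a downstream keyed operator, registering a per-key processing-time timer that clears the state; interval and types are hypothetical:

import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector

class TimedStateCleaner(intervalMs: Long) extends KeyedProcessFunction[String, String, String] {
  private lazy val seen: ValueState[java.lang.Boolean] =
    getRuntimeContext.getState(new ValueStateDescriptor("seen", classOf[java.lang.Boolean]))

  override def processElement(value: String,
                              ctx: KeyedProcessFunction[String, String, String]#Context,
                              out: Collector[String]): Unit = {
    if (seen.value() == null) {
      seen.update(true)
      // timers are per key: each key gets its own cleanup callback
      ctx.timerService().registerProcessingTimeTimer(
        ctx.timerService().currentProcessingTime() + intervalMs)
    }
    out.collect(value)
  }

  override def onTimer(ts: Long,
                       ctx: KeyedProcessFunction[String, String, String]#OnTimerContext,
                       out: Collector[String]): Unit =
    seen.clear() // mutate/clear the state for the timer's key
}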

Re: Firing timers on ProcessWindowFunction

2019-12-02 Thread Avi Levi
gt; > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany > > -- > > Ververica GmbH > Registered at Amtsgericht Charlottenburg: HRB 158244 B > Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji > (Tony) Cheng > > > > On Mon, Dec 2, 2

testing - syncing data timeline

2019-12-25 Thread Avi Levi
Hi, I have the following pipeline: 1. a single-hour window that counts the number of records 2. a single-day window that accepts the aggregated data from #1 and emits the highest hourly count of that day 3. a union of #1 + #2 4. a logic operator that accepts the data from #3 and keeps a listState of #2 and a

Re: testing - syncing data timeline

2019-12-25 Thread Avi Levi
wrote: > *This Message originated outside your organization.* > -- > Hi, > > You can merge the logic of #2 into #4, it will be much simpler. > > Best, > Kurt > > > On Wed, Dec 25, 2019 at 7:36 PM Avi Levi wrote: > >> Hi , >>

ClassNotFoundException: org.apache.kafka.common.metrics.stats.Rate$1

2018-11-20 Thread Avi Levi
Looking at the log file of the taskexecutor I see this exception: *at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1

Could not extract key Exception only on runtime not in dev environment

2018-11-20 Thread Avi Levi
I am running flink locally on my machine and getting the exception below when reading from a kafka topic. When running from the IDE (IntelliJ) it runs perfectly; however, when I deploy my jar to the flink runtime (locally) using */bin/flink run ~MyApp-1.0-SNAPSHOT.jar*. My class looks like this c

Re: Could not extract key Exception only on runtime not in dev environment

2018-11-20 Thread Avi Levi
ionFactor* *math.abs(h)* *}* *.map(...)* and still the same On Tue, Nov 20, 2018 at 5:31 PM miki haiat wrote: > What r.id Value ? > Are you sure that is not null ? > > Miki. > > > On Tue, 20 Nov 2018, 17:26 Avi Levi >> I am running flink locally on my
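"Could not extract key" is typically thrown when the KeySelector returns null or throws for some records; input that works in the IDE can still fail on the cluster where the full stream is seen. A null-safe, deterministic selector sketch (field name and factor hypothetical):

// r.id may be null at runtime; Math.floorMod keeps the bucket non-negative
val partitionFactor = 128
def bucketOf(id: String): Int =
  Math.floorMod(Option(id).getOrElse("").hashCode, partitionFactor)

// stream.keyBy(r => bucketOf(r.id))   // hypothetical usage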

your advice please regarding state

2018-11-21 Thread Avi Levi
Hi, I am very new to flink so please be gentle :) *The challenge:* I have a road sensor that should scan billions of cars per day. For starters I want to recognise whether each car that passes by is new or not. New cars (never seen before by that sensor) will be placed on a different topic on kafk
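Approach #1 from the replies in one sketch: key the stream by the car identifier and keep a per-key boolean flag, so the state is sharded across the cluster like a distributed key-value store; names and wiring are hypothetical:

import org.apache.flink.api.common.functions.RichFilterFunction
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration

// passes only first-seen identifiers; state lives per key on the keyed stream
class FirstSeenFilter extends RichFilterFunction[String] {
  @transient private var seen: ValueState[java.lang.Boolean] = _

  override def open(parameters: Configuration): Unit =
    seen = getRuntimeContext.getState(
      new ValueStateDescriptor("seen", classOf[java.lang.Boolean]))

  override def filter(plate: String): Boolean = {
    val isNew = seen.value() == null
    if (isNew) seen.update(true)
    isNew
  }
}

// newCars = plates.keyBy(p => p).filter(new FirstSeenFilter)  // hypothetical wiring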

Re: your advice please regarding state

2018-11-21 Thread Avi Levi
Thanks a lot! got it :) On Wed, Nov 21, 2018 at 11:40 PM Jamie Grier wrote: > Hi Avi, > > The typical approach would be as you've described in #1. #2 is not > necessary -- #1 is already doing basically exactly that. > > -Jamie > > > On Wed, Nov 21, 2018 at

where can I see logs from code

2018-11-25 Thread Avi Levi
Hi, where can I see the logs written by the app code (i.e. by the app developer)? BR Avi

Re: where can I see logs from code

2018-11-25 Thread Avi Levi
skmanager/TM_ID/log > > > > > > On Sun, Nov 25, 2018 at 12:11 PM Avi Levi wrote: > >> Hi, >> Where can I see the logs written by the app code (i.e by the app >> developer) ? >> >> BR >> Avi >> >

understanding kafka connector - rebalance

2018-11-25 Thread Avi Levi
Hi, looking at this example, wouldn't doing the "rebalance" operation (e.g. messageStream.rebalance().map(...)) on a heavy-load stream slow the stream? Does the rebalancing action occur only w

Re: understanding kafka connector - rebalance

2018-11-26 Thread Avi Levi
Koitawala > GS Lab Pune > +91 8407979163 > > > On Mon, Nov 26, 2018 at 11:33 AM Avi Levi wrote: > >> Hi >> Looking at this example >> <https://github.com/dataArtisans/kafka-example/blob/master/src/main/java/com/dataartisans/ReadFromKafka.java>, >> doin

Re: understanding kafka connector - rebalance

2018-11-26 Thread Avi Levi
um 10:59 Uhr schrieb Taher Koitawala < > taher.koitaw...@gslab.com>: > >> You can use rebalance before keyBy because rebalance returns DataStream. >> The API does not allow rebalance on keyedStreamed which is returned after >> keyBy so you are safe. >> >> O

Re: your advice please regarding state

2018-11-27 Thread Avi Levi
Thank you very much. Got it. On Tue, Nov 27, 2018 at 12:53 PM Fabian Hueske wrote: > Hi Avi, > > I'd definitely go for approach #1. > Flink will hash partition the records across all nodes. This is basically > the same as a distributed key-value store sharding keys. > I would not try to fine tun

Stream in loop and not getting to sink (Parquet writer )

2018-11-28 Thread Avi Levi
Hi, I am trying to implement a Parquet writer as a SinkFunction. The pipeline consists of kafka as the source and a parquet file as the sink; however, it seems like the stream is repeating itself in an endless loop and the parquet file is not written. Can someone please help me with this? object ParquetSinkWri

Re: Stream in loop and not getting to sink (Parquet writer )

2018-11-28 Thread Avi Levi
here for more details: > https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-and-timestamp-extractionwatermark-emission > > Hope this helps, > Rafi > > On Wed, Nov 28, 2018, 21:22 Avi Levi >> Hi, >> >> I am trying to implement P

Re: Stream in loop and not getting to sink (Parquet writer )

2018-11-28 Thread Avi Levi
s"In Sink $r") writer.write(r) } env.execute() // writer.close() } On Thu, Nov 29, 2018 at 6:57 AM vipul singh wrote: > Can you try closing the writer? > > AvroParquetWriter has an internal buffer. Try doing a .close() in > snapshot()( since you are chec

Re: Stream in loop and not getting to sink (Parquet writer )

2018-11-29 Thread Avi Levi
s >> long enough for at least some checkpoints to be completed? >> >> Thanks a lot, >> Kostas >> >> On Thu, Nov 29, 2018 at 7:03 AM Avi Levi wrote: >> >>> Checkout this little App. you can see that the file is created but no >>> data

Re: Stream in loop and not getting to sink (Parquet writer )

2018-11-29 Thread Avi Levi
ould you try to use Flink's Avro - Parquet writer? > > StreamingFileSink.forBulkFormat( > Path...(MY_PATH), > ParquetAvroWriters.forGenericRecord(MY_SCHEMA)) > .build() > > > Cheers, > Kostas > > On Thu, Nov 29, 2018 at 12:25 PM Avi Levi wrote: > >&g

Re: Stream in loop and not getting to sink (Parquet writer )

2018-11-30 Thread Avi Levi
o know more about Flink's Parquet-Avro writer, feel free to >> have a look at the ParquetAvroWriters >> class. >> >> Cheers, >> Kostas >> >> >> On Thu, Nov 29, 2018 at 6:58 PM Avi Levi wrote: >> >>

Re: Stream in loop and not getting to sink (Parquet writer )

2018-12-02 Thread Avi Levi
etingSink. > > In fact the StreamingFileSink is the "evolution" of the BucketingSink and > it supports > all the functionality that the BucketingSink supports. > > Given this, why not use the StreamingFileSink? > > On Sat, Dec 1, 2018 at 7:56 AM Avi Levi wrote: &

Looking for example for bucketingSink / StreamingFileSink

2018-12-03 Thread Avi Levi
Hi guys, very new to flink so my apologies for the newbie questions :) but I am desperately looking for a good example of streaming to a file using bucketingSink / StreamingFileSink. Unfortunately the examples in the documentation are not even compiling (at least not the ones in scala https://issues.ap

Re: Stream in loop and not getting to sink (Parquet writer )

2018-12-03 Thread Avi Levi
size. > The part-files roll on every checkpoint. This is a known limitation and > there are plans to > alleviate it in the future. > > Setting the batch size (among other things) is supported for RowWise > formats. > > Cheers, > Kostas > > On Sun, Dec 2, 2018 at 9

Trying to write to parquet file (kafka as a source) yields thousands of "in progress" files

2018-12-09 Thread Avi Levi
Hi, I am trying to read from kafka and write to parquet, but I am getting thousands of ".part-0-0in progress..." files (and counting...). Is that a bug or am I doing something wrong? object StreamParquet extends App { implicit val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getE

Re: Trying to write to parquet file (kafka as a source) yields thousands of "in progress" files

2018-12-10 Thread Avi Levi
) .build() On Sun, Dec 9, 2018 at 2:13 PM Avi Levi wrote: > Hi, > I am trying to read from kafka and write to parquet. But I am getting > thousands of ".part-0-0in progress..." files (and counting ...) > is that a bug or am I doing something wrong? > > obje
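The shape the thread converges on: a bulk-format StreamingFileSink plus enabled checkpointing, since bulk part files are only finalized on checkpoints. A sketch assuming a GenericRecord stream and its Avro schema (both assumed from the job):

import org.apache.avro.Schema
import org.apache.avro.generic.GenericRecord
import org.apache.flink.core.fs.Path
import org.apache.flink.formats.parquet.avro.ParquetAvroWriters
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.enableCheckpointing(60000) // without checkpoints, bulk files stay "in-progress" forever

// val schema: Schema = ...                      // Avro schema of the records (assumed)
// val records: DataStream[GenericRecord] = ...  // stream under test (assumed)
// records.addSink(
//   StreamingFileSink
//     .forBulkFormat(new Path("file:///tmp/out"),
//                    ParquetAvroWriters.forGenericRecord(schema))
//     .build())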

getting an error when configuring state backend to hdfs

2018-12-19 Thread Avi Levi
Hi, I am trying to set the state backend to hdfs *val stateUri = "hdfs/path_to_dir"* *val backend: RocksDBStateBackend = new RocksDBStateBackend(stateUri, true)* *env.setStateBackend(backend)* I am running flink 1.7.0 with the following dependencies (tried them in different combinations)

Re: getting an error when configuring state backend to hdfs

2018-12-19 Thread Avi Levi
in the /lib directory of the flink distribution. > > On 19.12.2018 15:03, Avi Levi wrote: > > Hi, > I am trying to set the backend state to hdfs > *val stateUri = "hdfs/path_to_dir"* > *val backend: RocksDBStateBackend = new RocksDBStateBackend(stateUri, > true)* >

Re: getting an error when configuring state backend to hdfs

2018-12-19 Thread Avi Levi
ortedSchemeFactory.java:64) at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399) ... 23 more On Wed, Dec 19, 2018 at 6:50 PM Steven Nelson wrote: > What image are you using? > > Sent from my iPhone > > On Dec 19, 2018, at 9:44 AM, Avi Levi wrote: > > Hi Chesnay, &

Connecting to kafka with tls

2018-12-23 Thread Avi Levi
Hi, can anyone give me an example of how to use the kafka client with tls support? I must use tls to connect to our kafka. Thanks in advance, Avi
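The TLS settings are plain Kafka client properties passed to the Flink consumer. A sketch; paths, passwords, and the broker address are placeholders, and the keystore lines are only needed when the broker requires client authentication (mutual TLS):

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

val props = new Properties()
props.setProperty("bootstrap.servers", "broker:9093") // the broker's TLS listener
props.setProperty("group.id", "my-group")
props.setProperty("security.protocol", "SSL")
props.setProperty("ssl.truststore.location", "/path/to/truststore.jks")
props.setProperty("ssl.truststore.password", "<truststore-password>")
// client authentication (mutual TLS) only:
props.setProperty("ssl.keystore.location", "/path/to/keystore.jks")
props.setProperty("ssl.keystore.password", "<keystore-password>")

val consumer = new FlinkKafkaConsumer[String]("my-topic", new SimpleStringSchema(), props)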

Re: getting an error when configuring state backend to hdfs

2018-12-24 Thread Avi Levi
[1] > > [1] https://flink.apache.org/downloads.html#latest-stable-release-v171 > > > Best > Yun Tang > -- > *From:* Avi Levi > *Sent:* Thursday, December 20, 2018 2:11 > *To:* Steven Nelson > *Cc:* Chesnay Schepler; user@flink.apache.o

getting Timeout expired while fetching topic metadata

2018-12-24 Thread Avi Levi
Hi all, very new to flink so my apologies if this seems trivial. We deployed flink on gcloud. I am trying to connect to kafka but keep getting this error: *org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata* This is how my properties look: val consumerPropert

Re: getting Timeout expired while fetching topic metadata

2018-12-24 Thread Avi Levi
TOCOL_CONFIG, "SSL"); > > Thanks, > Miki > > On Mon, Dec 24, 2018 at 8:19 PM Avi Levi wrote: > >> Hi all, >> very new to flink so my apology if it seems trivial. >> We deployed flink on gcloud >> I am trying to connect to kafka but keep getting

using updating shared data

2019-01-01 Thread Avi Levi
Hi, I have a list (a couple of thousand text lines) that I need to use in my map function. I read this article about broadcasting variables or using distributed cache
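The replies below point at broadcast state: ship the list as a control stream and connect it to the main stream, so updates propagate at runtime. A sketch with hypothetical stream names (events, controlLines):

import org.apache.flink.api.common.state.MapStateDescriptor
import org.apache.flink.api.common.typeinfo.BasicTypeInfo
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

val listDescriptor = new MapStateDescriptor[String, String](
  "control-list", BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO)

class MatchAgainstList extends BroadcastProcessFunction[String, String, String] {
  override def processElement(value: String,
      ctx: BroadcastProcessFunction[String, String, String]#ReadOnlyContext,
      out: Collector[String]): Unit =
    if (ctx.getBroadcastState(listDescriptor).contains(value)) out.collect(value)

  override def processBroadcastElement(entry: String,
      ctx: BroadcastProcessFunction[String, String, String]#Context,
      out: Collector[String]): Unit =
    ctx.getBroadcastState(listDescriptor).put(entry, entry) // update the shared list
}

// events.connect(controlLines.broadcast(listDescriptor)).process(new MatchAgainstList)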

Re: using updating shared data

2019-01-02 Thread Avi Levi
Cheers, > Till > > On Tue, Jan 1, 2019 at 6:49 PM miki haiat wrote: > >> Im trying to understand your use case. >> What is the sour

Re: using updating shared data

2019-01-03 Thread Avi Levi
om the control stream. > > On Wed, Jan 2, 2019 at 8:42 AM Avi Levi wrote: > >> Thanks Till I will defiantly going to check it. just to make sure that I >> got you correctly. you are suggesting the the list that I want to broadcast >> will be broadcasted via control strea

Re: using updating shared data

2019-01-06 Thread Avi Levi
ays returns > Watermark.MAX_WATERMARK as the current watermark. This produces watermarks > for the control stream that will effectively be ignored. > > On Thu, Jan 3, 2019 at 9:18 PM Avi Levi wrote: > >> Thanks for the tip Elias! >> >> On Wed, Jan 2, 2019 at 9:44 PM Elias

Passing vm options

2019-01-07 Thread Avi Levi
Hi, I am trying to pass some vm options, e.g. bin/flink run foo-assembly-0.0.1-SNAPSHOT.jar -Dflink.stateDir=file:///tmp/ -Dkafka.bootstrap.servers="localhost:9092" -Dkafka.security.ssl.enabled=false, but they don't seem to override the values in application.conf. Am I missing something? BTW is it p
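One pitfall worth noting: anything placed after the jar on `bin/flink run` reaches main() as program arguments, not as JVM `-D` system properties, so typesafe-config's application.conf never sees them (JVM-level options belong in env.java.opts in flink-conf.yaml). A sketch reading them as program arguments instead; keys and defaults are hypothetical:

import org.apache.flink.api.java.utils.ParameterTool

object Main {
  def main(args: Array[String]): Unit = {
    // bin/flink run app.jar --kafka.bootstrap.servers localhost:9092 --flink.stateDir file:///tmp/
    val params   = ParameterTool.fromArgs(args)
    val servers  = params.get("kafka.bootstrap.servers", "localhost:9092")
    val stateDir = params.get("flink.stateDir", "file:///tmp/")
    // optionally expose them in the web UI:
    // env.getConfig.setGlobalJobParameters(params)
  }
}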

Re: Passing vm options

2019-01-08 Thread Avi Levi
am afraid it is not possible. > > Best Regards, > Dom. > > On Mon, 7 Jan 2019 at 09:01, Avi Levi wrote: > >> Hi , >> I am trying to pass some vm options e.g >> bin/flink run foo-assembly-0.0.1-SNAPSHOT.jar >> -Dflink.stateDir=file:///tmp/ -Dkafka.bootst

Getting RemoteTransportException

2019-01-16 Thread Avi Levi
Hi guys, we ran some load tests and got the exception below. I saw that the JobManager was restarted; if I understood correctly, it will get a new job id and the state will be lost - is that correct? How is the state handled when setting HA as described here

Re: Getting RemoteTransportException, HA mode

2019-01-20 Thread Avi Levi
; It is the size of the thread pool that will be used by the netty server. >> The default value is -1, which will result in a thread pool with size >> equal to the number of task slots for your JobManager. >> >> Best Regards, >> Dom. >> >> On Thu, 17 Jan 2019 at

getting duplicate messages from duplicate jobs

2019-01-23 Thread Avi Levi
Hi, this is quite confusing. I submitted the same stateless job twice (actually I uploaded it once). However, when I place a message on kafka, it seems that both jobs consume it and publish the same result (we publish the result to another kafka topic, so I actually see the message duplicated on kafka).

Re: getting duplicate messages from duplicate jobs

2019-01-30 Thread Avi Levi
> and committing offsets back to Kafka (only for exposure purposes, not used > for processing guarantees). > The Flink Kafka Consumer uses static partition assignment on the > KafkaConsumer API, and not consumer group-based automatic partition > assignments. > > Cheers, &g

Production readiness

2019-02-13 Thread Avi Levi
Hi, looking at the production readiness checklist - is there any rule of thumb to determine the maximum parallelism? We have a stateful pipeline with high throughput (

Reading messages from start - new job submission

2019-02-17 Thread Avi Levi
I'm updating a job without a savepoint. The consumer properties are set to *prop.setProperty("auto.offset.reset", "earliest")* The start strategy is not explicitly set (using the default setStartFromGroupOffsets). In this case I expect the consumer to read the messages from the beginning sinc
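For reference: with setStartFromGroupOffsets, "auto.offset.reset" only applies when the group has no committed offset in Kafka, so a redeployed job usually resumes where the group left off. Forcing a full replay needs an explicit start position; properties below are assumed:

import java.util.Properties
import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

val prop = new Properties() // bootstrap.servers, group.id, ... (assumed)
val consumer = new FlinkKafkaConsumer[String]("topic", new SimpleStringSchema(), prop)
consumer.setStartFromEarliest() // note: ignored when restoring from a savepoint/checkpoint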

estimate number of keys on rocks db

2019-03-10 Thread Avi Levi
Hi, I am trying to estimate the number of keys at a given minute. I created a graph based on avg_over_time with 1hr and 5m intervals. Looking at the graph, you can see that it has high spikes, which doesn't make sense

Re: estimate number of keys on rocks db

2019-03-10 Thread Avi Levi
cannot see the attached images. By the way, did you ever > use window in this job? > > Best > Yun Tang > ------ > *From:* Avi Levi > *Sent:* Sunday, March 10, 2019 19:41 > *To:* user > *Subject:* estimate number of keys on rocks db > > Hi

Random forest - Flink ML

2019-03-11 Thread Avi Levi
Hi, according to Till's comment I understand that flink-ml is going to be ditched. What will be the alternative? Looking for a "rando

Re: Random forest - Flink ML

2019-03-12 Thread Avi Levi
oject.eu/apache-flink/ > > On Mon, 11 Mar 2019 at 15:40, Avi Levi wrote: > >> Hi, >> According to Till's comment >> <https://issues.apache.org/jira/browse/FLINK-1728?focusedCommentId=16780468&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#c

ingesting time for TimeCharacteristic.IngestionTime on unit test

2019-03-19 Thread Avi Levi
Hi, our stream is not based on a time sequence and we do not use time-based operations. We do want to clean the state after x days, hence we fire a timer event. My problem is that our unit test fires the event immediately (there is no ingestion time). How can I inject ingestion time? Cheers, Avi

Re: ingesting time for TimeCharacteristic.IngestionTime on unit test

2019-03-23 Thread Avi Levi
Any idea what I should do to overcome this? On Wed, Mar 20, 2019 at 7:17 PM Avi Levi wrote: > Hi Andrey, > I am testing a Filter operator that receives a key from the stream and > checks if it is a new one or not. If it is new it keeps it in state and > fires a timer, all that is do

Re: ingesting time for TimeCharacteristic.IngestionTime on unit test

2019-03-25 Thread Avi Levi
ld > also consider setting event time timers but keep in mind then your > Watermark emission interval. > > In addition, if you want to simply check processing time processing of you > operator (not the whole pipeline), then you could make use of the > OneInputStreamTaskTestHarness or its keyed va
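A sketch of the keyed test-harness route mentioned in the reply, which lets the test own the processing-time clock; it requires the flink-streaming-java test-jar, and DedupFn (a KeyedProcessFunction[String, String, String]) is assumed from the job under test:

import org.apache.flink.api.common.typeinfo.BasicTypeInfo
import org.apache.flink.streaming.api.operators.KeyedProcessOperator
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord
import org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness

val harness = new KeyedOneInputStreamOperatorTestHarness[String, String, String](
  new KeyedProcessOperator(new DedupFn),
  (in: String) => in,                  // key selector
  BasicTypeInfo.STRING_TYPE_INFO)

harness.open()
harness.processElement(new StreamRecord("a", 0L))
// advance the mock clock far enough to fire the registered cleanup timer
harness.setProcessingTime(java.time.Duration.ofDays(3).toMillis)
harness.close()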

questions regarding offset

2019-03-27 Thread Avi Levi
Hi guys, I understood that the offset is kept as part of the checkpoint and persisted in the state (please correct me if I'm wrong). 1. If I copy my persisted state to another cluster (with different kafka servers as well), how is the offset handled? 2. In a stateless job, how is the offset managed? Since t

Re: questions regarding offset

2019-03-28 Thread Avi Levi
onfiguration > > [2] > > https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/connectors/kafka.html#kafka-consumers-start-position-configuration > > > On 28/03/2019 06:51, Avi Levi wrote: > > Hi Guys, > > I understood that offset is kept as part of the chec

AskTimeoutException - Cannot deploy task

2019-03-28 Thread Avi Levi
Hi, I see the following exceptions and would really appreciate any help. Thanks, Avi. This is the first one (out of three): java.lang.Exception: Cannot deploy task KeyedProcess -> Sink: Unnamed (3/100) (2c9646634afe1488659da404e92697b0) - TaskManager (container_e03_1553795623823_0001_01_000

Re: AskTimeoutException - Cannot deploy task

2019-03-28 Thread Avi Levi
nce(KafkaConsumer.java:1096) at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043) at org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread.run(KafkaConsumerThread.java:257) -- Forwarded message - From: Avi Levi Date: Thu, Mar 28, 2

How to see user configuration in UI dashboard

2019-04-07 Thread Avi Levi
Is it possible (and if so, how) to set custom parameters programmatically so that they can be viewed in the configuration tab (UI), either in the execution config or the user configuration?

Adding metadata to the jar

2019-04-08 Thread Avi Levi
Is there a way to add some metadata to the jar and see it on the dashboard? I couldn't find a way to do so, but I think it would be very useful. Consider that you want to know which version is actually running in the job manager (not just which jar is uploaded, which is not necessarily running at the moment

Re: Adding metadata to the jar

2019-04-09 Thread Avi Levi
>> val buildInfo = new Configuration() >> buildInfo.setString("version", "0.1.0") >> >> >> val env = StreamExecutionEnvironment.*getExecutionEnvironment >> *env.getConfig.setGlobalJobParameters(buildInfo) >> ... >> >> It helps us to have a con

Re: Getting JobExecutionException: Could not set up JobManager when trying to upload new version

2019-04-23 Thread Avi Levi
Might be useful for someone regarding this issue: it seems that changing the uid of the operator made this mess. On Tue, Apr 16, 2019 at 6:31 PM Avi Levi wrote: > I am trying to upload a new version of the code but I am getting the > exception below. The schema of the state was not c

Emitting current state to a sink

2019-04-25 Thread Avi Levi
Hi, we have a keyed pipeline with persisted state. Is there a way to broadcast a command and collect all values that are persisted in the state? The end result could be, for example, sending a fetch command to all operators and emitting the results to some sink. Why do we need it? From time to time we

Re: Emitting current state to a sink

2019-04-26 Thread Avi Levi
artisans/flinktraining/solutions/datastream_java/broadcast/TaxiQuerySolution.java > <https://github.com/ververica/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/solutions/datastream_java/broadcast/TaxiQuerySolution.java> > > Am 26.04.19 um 07

Re: Emitting current state to a sink

2019-04-29 Thread Avi Levi
t; Maybe you can pass the collector to the KeyedStateFunction and emit > records while it iterates over the key space. > > Best, Fabian > > On Fri, Apr 26, 2019 at 17:35, Avi Levi < > avi.l...@bluevoyant.com> wrote: > >> Hi Timo, >> I definitely did, but br

Re: Emitting current state to a sink

2019-04-30 Thread Avi Levi
; > fhue...@gmail.com> wrote: > > > Nice! > Thanks for the confirmation :-) > > On Mon, Apr 29, 2019 at 13:21, Avi Levi < > avi.l...@bluevoyant.com> wrote: > > Thanks! Works like a charm :) > > On Mon, Apr 29, 2019 at 12:11 PM Fabian Hueske wrote:

Getting async function call terminated with an exception

2019-05-07 Thread Avi Levi
Hi, we are using flink 1.8.0 (but the following also happens in 1.7.2). I tried a very simple unordered async call: override def asyncInvoke(input: Foo, resultFuture: ResultFuture[ScoredFoo]) : Unit = { val r = ScoredFoo(Foo("a"), 80) Future.successful(r) } Running this stream seems to be stuck

Re: Getting async function call terminated with an exception

2019-05-12 Thread Avi Levi
tFuture.complete(r). > > Cheers, > Till > > On Tue, May 7, 2019 at 8:30 PM Avi Levi wrote: > >> Hi, >> We are using flink 1.8.0 (but the flowing also happens in 1.7.2) I tried >> very simple unordered async call >> override def asyncInvoke(input: Foo, resul
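The fix in one sketch: the ResultFuture must be completed explicitly; constructing a Scala Future and dropping it completes nothing, so the operator waits until its timeout. Types are assumed from the original post:

import org.apache.flink.streaming.api.scala.async.{AsyncFunction, ResultFuture}

case class Foo(value: String)            // types assumed from the original post
case class ScoredFoo(foo: Foo, score: Int)

class ScoreAsync extends AsyncFunction[Foo, ScoredFoo] {
  override def asyncInvoke(input: Foo, resultFuture: ResultFuture[ScoredFoo]): Unit = {
    val r = ScoredFoo(input, 80)
    resultFuture.complete(Iterable(r)) // without this call the stream appears stuck
  }
}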

Connection refused while trying to query state

2019-06-30 Thread Avi Levi
Hi, I am trying to query state (cluster 1.8.0 is running on my local machine). I do see in the logs "Started the Queryable State Proxy Server @ ...", but when I try to query the state from the client, val descriptor = new ValueStateDescriptor("queryable-state", Types.CASE_CLASS[State]) cli
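For comparison, a client-side sketch; the address must match the "Queryable State Proxy Server @ ..." log line (default port 9069), and the job id, key, key type, and State case class are all assumed from the job:

import org.apache.flink.api.common.JobID
import org.apache.flink.api.common.typeinfo.BasicTypeInfo
import org.apache.flink.queryablestate.client.QueryableStateClient

val client = new QueryableStateClient("127.0.0.1", 9069)

// val descriptor = new ValueStateDescriptor("queryable-state", Types.CASE_CLASS[State])
// client
//   .getKvState(JobID.fromHexString("<running-job-id>"), "queryable-state",
//               key, BasicTypeInfo.STRING_TYPE_INFO, descriptor)
//   .thenAccept(state => println(state.value()))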

Setting consumer offset

2019-07-02 Thread Avi Levi
Hi, If I set in code the consumer offset e.g *consumer.setStartFromTimestamp* and I start the job from a curtain savepoint/checkpoint will the offset in the checkpoint will override the the offset that is defined in the code ? Best Regards Avi

Re: Connection refused while trying to query state

2019-07-02 Thread Avi Levi
o you point the client to the correct address? This means where the > "Queryable State Proxy Server @ ..." says? > > Cheers, > Kostas > > On Sun, Jun 30, 2019 at 4:37 PM Avi Levi wrote: > >> Hi, >> I am trying to query state (cluster 1.8.0 is running on m

Queryable state and TTL

2019-07-03 Thread Avi Levi
Hi, adding queryable state to state with ttl is not supported in 1.8.0 (throwing java.lang.IllegalArgumentException: Queryable state is currently not supported with TTL). I saw in a previous mailing thread

Re: Queryable state and TTL

2019-07-03 Thread Avi Levi
he road map but I am not aware about plans of any contributor >> to work on it for the next releases. >> I think the community will firstly work on the event time support for TTL. >> I will loop Yu in, maybe he has some plans to work on TTL for the >> queryable state. >>

Re: Queryable state and TTL

2019-07-06 Thread Avi Levi
get the impression that the queryable state feature is > used very often. Feel free to take it up, if you like. > https://github.com/apache/flink/pull/6626 > > -Eron > > On Wed, Jul 3, 2019 at 11:21 PM Avi Levi wrote: > >>

State incompatible

2019-07-14 Thread Avi Levi
Hi, I added a ttl to my state. *Old version:* private lazy val stateDescriptor = new ValueStateDescriptor("foo", Types.CASE_CLASS[DomainState]) *vs. the new version:* @transient private lazy val storeTtl = StateTtlConfig.newBuilder(90) .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
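Background for readers: enabling TTL stores an expiry timestamp alongside each value, which changes the serialized state layout, so a savepoint taken without TTL cannot be restored into the TTL-enabled descriptor (as the reply below confirms). A sketch of the new-style descriptor, with the 90-day figure taken from the post and DomainState assumed:

import org.apache.flink.api.common.state.{StateTtlConfig, ValueStateDescriptor}
import org.apache.flink.api.common.time.Time

val storeTtl = StateTtlConfig
  .newBuilder(Time.days(90))
  .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
  .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
  .build()

// val stateDescriptor = new ValueStateDescriptor("foo", Types.CASE_CLASS[DomainState])
// stateDescriptor.enableTimeToLive(storeTtl)   // DomainState assumed from the post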

Re: State incompatible

2019-07-15 Thread Avi Levi
Thanks Haibo, bummer ;) On Mon, Jul 15, 2019 at 12:27 PM Haibo Sun wrote: > *This Message originated outside your organization.* > -- > Hi, Avi Levi > > I don't think there's any way to solve this problem right now, and Flink > documenta