Re: Error while connecting with MSSQL server

2020-12-09 Thread aj
or-jdbc doesn't support MS Server dialect. Only >> MySQL and Postgres are supported. >> >> Best, >> Jark >> >> On Tue, 8 Dec 2020 at 01:20, aj wrote: >> >>> Hello , >>> >>> I am trying to create a table with microsoft sql serve

Stream job getting Failed

2020-12-09 Thread aj
I have a Flink stream job that reads data from Kafka and writes it to S3. This job keeps failing after running for 2-3 days. I am not able to find anything in the logs about why it is failing. Can somebody help me figure out the cause of the failure? I can only see this in the logs: org.apache.flink.streamin

Error while connecting with MSSQL server

2020-12-07 Thread aj
Hello, I am trying to create a table with Microsoft SQL Server using Flink SQL:

    CREATE TABLE sampleSQLSink (
        id INTEGER,
        message STRING,
        ts TIMESTAMP(3),
        proctime AS PROCTIME()
    ) WITH (
        'connector' = 'jdbc',
        'driver' = 'com.microsoft.sqlserver.jdbc.SQLServerDriver',
        'u
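
Per the reply above, the Flink SQL JDBC connector at this point only supports the MySQL and Postgres dialects, so the DDL route fails regardless of the driver. A possible workaround, sketched below on the assumption that flink-connector-jdbc and the MSSQL driver are on the classpath, is the DataStream-level JdbcSink, which accepts any JDBC URL; the Msg POJO, table columns, and host/db placeholders are hypothetical:

    import org.apache.flink.connector.jdbc.JdbcConnectionOptions;
    import org.apache.flink.connector.jdbc.JdbcSink;

    // Sketch: write DataStream<Msg> rows into SQL Server with plain JDBC,
    // bypassing the SQL dialect check. Msg is a hypothetical POJO.
    stream.addSink(JdbcSink.sink(
            "INSERT INTO sampleSQLSink (id, message, ts) VALUES (?, ?, ?)",
            (ps, msg) -> {
                ps.setInt(1, msg.id);
                ps.setString(2, msg.message);
                ps.setTimestamp(3, msg.ts);
            },
            new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
                    .withUrl("jdbc:sqlserver://<host>:1433;databaseName=<db>")
                    .withDriverName("com.microsoft.sqlserver.jdbc.SQLServerDriver")
                    .build()));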

Re: Broadcasting control messages to a sink

2020-12-04 Thread aj
Hi Jaffe, Can you please help me out with sample code for how you have written the custom sink and how you are using this broadcast pattern to update the schema at run time? It would help me. On Sat, Oct 17, 2020 at 1:55 PM Piotr Nowojski wrote: > Hi Julian, > > Glad to hear it worked! And thanks for comi

Re: Broadcasting control messages to a sink

2020-10-15 Thread aj
Hi Jaffe, I am also working on a similar type of problem. I am receiving a set of events in Avro format on different topics. I want to consume these and write to S3 in parquet format. I have written a job that creates a different stream for each event and fetches its schema from the co

Flink Collector issue when Collection Object

2020-09-28 Thread aj
Hello All, Can somebody help me resolve this and understand what I am doing wrong? https://stackoverflow.com/questions/64063833/flink-collector-issue-when-collection-object-with-map-of-object-class -- Thanks & Regards, Anuj Jain

Re: Flink 1.11 TaskManagerLogFileHandler -Exception

2020-09-04 Thread aj
I have also checked the ports, and all the ports from 0-65535 are open. I also do not see any taskmanager.log getting generated under my container logs on the task machine. On Fri, Sep 4, 2020 at 2:58 PM aj wrote: > > I am trying to switch to Flink 1.11 with the new EMR release 6.1.

Flink 1.11 TaskManagerLogFileHandler -Exception

2020-09-04 Thread aj
I am trying to switch to Flink 1.11 with the new EMR release 6.1. I have created a 3-node EMR cluster with Flink 1.11. When I run my job it works fine; the only issue is that I am not able to see any logs in the job manager and task manager. I am seeing the below exception in the stdout of the job manager 09:2

Re: Count of records coming into the Flink App from KDS and at each step through Execution Graph

2020-09-03 Thread aj
Hello Vijay, I have the same use case where I am reading from Kafka and want to report a count corresponding to each event every 5 mins. On Prometheus, I want to set an alert if for any event type we do not receive events, say the count is zero. So can you please help me with how you implemented this f

Not able to Assign Watermark in Flink 1.11

2020-08-27 Thread aj
I am getting this error when trying to assign a watermark in Flink 1.11:

    "Cannot resolve method 'withTimestampAssigner(anonymous org.apache.flink.api.common.eventtime.SerializableTimestampAssigner)'"

    FlinkKafkaConsumer bookingFlowConsumer = new FlinkKafkaConsumer(topics, new KafkaGenericAvroD
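
This compile error usually comes from Java failing to infer the generic type of the strategy builder (the raw FlinkKafkaConsumer makes everything raw). A minimal sketch of the 1.11-style assignment, assuming the consumer is typed FlinkKafkaConsumer<GenericRecord> and the records carry a long "event_ts" field (field name and delay are assumptions):

    import java.time.Duration;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.api.common.eventtime.WatermarkStrategy;

    // Spell out the generic parameter on forBoundedOutOfOrderness so that
    // withTimestampAssigner resolves against the right type.
    WatermarkStrategy<GenericRecord> strategy = WatermarkStrategy
            .<GenericRecord>forBoundedOutOfOrderness(Duration.ofSeconds(10))
            .withTimestampAssigner((record, previousTs) -> (long) record.get("event_ts"));

    bookingFlowConsumer.assignTimestampsAndWatermarks(strategy);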

Stream job with Event Timestamp (Backfill getting incorrect results)

2020-08-20 Thread aj
I have a streaming job where I am doing a window operation on *"user_id"* and then doing some summarization based on time-based logic like: 1. End the session based on 30 mins of inactivity of the user. 2. The End_trip event or cancellation event has arrived for the user. I am trying to rerun

Re: Flink Stream job to parquet sink

2020-06-29 Thread aj
t state gets > prefixes (eventName), but it should be rather straightforward. > > On Thu, Jun 25, 2020 at 3:47 PM aj wrote: > >> Thanks, Arvide for detailed answers. >> - Have some kind submitter that restarts flink automatically on config >> change (assumes that re

Re: Flink Stream job to parquet sink

2020-06-25 Thread aj
nal sinks. Then you >> could simply add a new sink whenever a new event occurs. >> >> The first option is the easiest and the last option the most versatile >> (could even have different sink types mixed). >> >> On Tue, Jun 23, 2020 at 5:34 AM aj wrote: >> &

Re: Flink Stream job to parquet sink

2020-06-22 Thread aj
I am stuck on this. Please give some suggestions. On Tue, Jun 9, 2020, 21:40 aj wrote: > please help with this. Any suggestions. > > On Sat, Jun 6, 2020 at 12:20 PM aj wrote: > >> Hello All, >> >> I am receiving a set of events in Avro format on different topi

Flink Count of Events using metric

2020-06-16 Thread aj
Please help me with this: https://stackoverflow.com/questions/62297467/flink-count-of-events-using-metric I have a topic in Kafka where I am getting multiple types of events in JSON format. I have created a file stream sink to write these events to S3 with bucketing. Now I want to publish an hou
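
A metric-based approach can be sketched as below: a rich function registers a Counter with Flink's metric system, which a Prometheus reporter can then scrape at any interval. The Event type and the "search" event name are assumptions:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    // Sketch: count one event type and pass the record through unchanged.
    public class EventCounter extends RichMapFunction<Event, Event> {
        private transient Counter searchEvents;

        @Override
        public void open(Configuration parameters) {
            searchEvents = getRuntimeContext().getMetricGroup().counter("searchEventCount");
        }

        @Override
        public Event map(Event e) {
            if ("search".equals(e.getEventName())) {
                searchEvents.inc(); // exposed via the configured metrics reporter
            }
            return e;
        }
    }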

Re: Flink on yarn : yarn-session understanding

2020-06-16 Thread aj
1:03 PM Andrey Zagrebin wrote: > Hi Anuj, > > Afaik, the REST API should work for both modes. What is the issue? Maybe, > some network problem to connect to YARN application master? > > Best, > Andrey > > On Mon, Jun 8, 2020 at 4:39 PM aj wrote: > >> I

Re: Flink Stream job to parquet sink

2020-06-09 Thread aj
Please help with this. Any suggestions? On Sat, Jun 6, 2020 at 12:20 PM aj wrote: > Hello All, > > I am receiving a set of events in Avro format on different topics. I want > to consume these and write to s3 in parquet format. > I have written a below job that creates a diffe

Re: Data Quality Library in Flink

2020-06-09 Thread aj
link/flink-docs-release-1.10/monitoring/metrics.html > [3] > https://ci.apache.org/projects/flink/flink-docs-release-1.10/monitoring/metrics.html#prometheus-orgapacheflinkmetricsprometheusprometheusreporter > [4] https://prometheus.io/docs/alerting/overview/ > > On Sat, Jun 6, 2020 at 9:2

Flink on yarn : yarn-session understanding

2020-06-08 Thread aj
I am running some stream jobs that are always long-running. I am currently submitting each job as a standalone job on YARN. 1. I need to understand what the advantage of using yarn-session is and when I should use it. 2. Also, I am not able to access the REST API services; is it because I am running

Flink Stream job to parquet sink

2020-06-05 Thread aj
Hello All, I am receiving a set of events in Avro format on different topics. I want to consume these and write to S3 in parquet format. I have written the below job that creates a different stream for each event and fetches its schema from the Confluent schema registry to create a parquet sink for a

Data Quality Library in Flink

2020-06-05 Thread aj
Hello All, I want to do some data quality analysis on stream data, for example: 1. Fill rate of a particular column 2. How many events are going to the error queue due to Avro schema validation failures? 3. Different statistical measures of a column. 4. Alert if a particular threshold is breached (like if f

Re: Re: Re: Flink Window with multiple trigger condition

2020-05-30 Thread aj
>[2] > https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/process_function.html#example > > > --Original Mail -- > *Sender:*aj > *Send Date:*Fri May 29 02:07:33 2020 > *Recipients:*Yun Gao > *CC:*user > *Su

Re: Flink Elastic Sink

2020-05-30 Thread aj
Thanks, it worked. I was confused before, as I was thinking the sink builder is called only once, but it gets called for every batch request; correct me if my understanding is wrong. On Fri, May 29, 2020 at 9:08 AM Leonard Xu wrote: > Hi,aj > > In the implementation of Elastics

Flink Elastic Sink

2020-05-28 Thread aj
Hello All, I am getting many events in Kafka and I have written a Flink job that sinks the Avro records from Kafka to S3 in parquet format. Now, I want to sink these records into Elasticsearch, but the only challenge is that I want to sink the records onto time-based indices. Basically, in Elastic, I want to
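
One way to route each record to a daily index, sketched under the assumption of the flink-connector-elasticsearch7 API; the "event_ts" field, the index prefix, and httpHosts are placeholders:

    import java.time.Instant;
    import java.time.ZoneOffset;
    import java.time.format.DateTimeFormatter;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink;
    import org.elasticsearch.client.Requests;
    import org.elasticsearch.common.xcontent.XContentType;

    // Sketch: derive the index name from the record's own timestamp so each
    // document lands in a daily index like "events-2020-05-28".
    ElasticsearchSink.Builder<GenericRecord> builder = new ElasticsearchSink.Builder<>(
            httpHosts,
            (record, ctx, indexer) -> {
                String day = DateTimeFormatter.ISO_LOCAL_DATE.format(
                        Instant.ofEpochMilli((long) record.get("event_ts")).atZone(ZoneOffset.UTC));
                indexer.add(Requests.indexRequest()
                        .index("events-" + day)
                        // GenericRecord#toString() renders the record as JSON
                        .source(record.toString(), XContentType.JSON));
            });
    stream.addSink(builder.build());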

Re: Re: Flink Window with multiple trigger condition

2020-05-28 Thread aj
e timer if it is triggered by " > *start*" event. A simpler case is [1] and it does not consider stop the > aggreation when received special event, but it seems that the logic could > be added to the case. > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.10/de

Re: Flink Window with multiple trigger condition

2020-05-23 Thread aj
at 4:52 PM aj wrote: > > > I was also thinking to have a processing time window but that will not > work for me. I want to start the window when the user "*search*" event > arrives. So for each user window will start from the *search* event. > The Tumbling window

Re: Flink Window with multiple trigger condition

2020-05-22 Thread aj
I was also thinking of having a processing-time window, but that will not work for me. I want to start the window when the user's *search* event arrives. So for each user, the window will start from the *search* event. The tumbling window has a fixed start and end time, so that will not be suitable in my case.

Re: Flink Window with multiple trigger condition

2020-05-21 Thread aj
ave implementations for such > scenarios, you may need to create your own WindowAssigner. > > For the trigger, yes, you can always implement a trigger to determine the > lifecyle of a window. > > > > [1]. > https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/oper

Flink Window with multiple trigger condition

2020-05-20 Thread aj
Hello All, I am getting a lot of user events in a stream. There are different types of events; now I want to build some aggregation metrics for the user by grouping events into buckets. My condition for windowing is: 1. Start the window for the user when event_name *"search"* arrives for the u
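
A minimal sketch of one way to do this, not the thread's final answer: a KeyedProcessFunction that opens a per-user "window" on a *search* event and closes it either on an explicit end event or after 30 minutes of inactivity. The Event/Agg types and their fields are assumptions:

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    public class SearchSession extends KeyedProcessFunction<String, Event, Agg> {
        private transient ValueState<Agg> agg;
        private transient ValueState<Long> timer;

        @Override
        public void open(Configuration parameters) {
            agg = getRuntimeContext().getState(new ValueStateDescriptor<>("agg", Agg.class));
            timer = getRuntimeContext().getState(new ValueStateDescriptor<>("timer", Long.class));
        }

        @Override
        public void processElement(Event e, Context ctx, Collector<Agg> out) throws Exception {
            if ("search".equals(e.name) && agg.value() == null) {
                agg.update(new Agg());              // window starts on "search"
            }
            Agg current = agg.value();
            if (current == null) {
                return;                             // ignore events outside a window
            }
            current.add(e);
            agg.update(current);
            // push the inactivity timer 30 minutes ahead of this event
            if (timer.value() != null) {
                ctx.timerService().deleteEventTimeTimer(timer.value());
            }
            long deadline = e.timestamp + 30 * 60 * 1000L;
            ctx.timerService().registerEventTimeTimer(deadline);
            timer.update(deadline);
            if ("end_trip".equals(e.name)) {        // explicit close condition
                out.collect(current);
                agg.clear();
                timer.clear();
            }
        }

        @Override
        public void onTimer(long ts, OnTimerContext ctx, Collector<Agg> out) throws Exception {
            if (agg.value() != null) {              // inactivity close
                out.collect(agg.value());
            }
            agg.clear();
            timer.clear();
        }
    }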

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-05-14 Thread aj
. Even in the new EMR version (EMR 6.0), Flink has been removed. On Sat, May 9, 2020 at 1:36 PM aj wrote: > Hello Yang, > > I have attached my pom file and I did not see that I am using any Hadoop > dependency. Can you please help me. > > On Wed, May 6, 2020 at 1:22 PM Yang Wang

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-05-09 Thread aj
Hello Yang, I have attached my pom file and I do not see that I am using any Hadoop dependency. Can you please help me? On Wed, May 6, 2020 at 1:22 PM Yang Wang wrote: > Hi aj, > > From the logs you have provided, the hadoop version is still 2.4.1. > Could you check the user jar

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-05-06 Thread aj
Hello, Please help me upgrade to 1.10 in AWS EMR. On Fri, May 1, 2020 at 4:05 PM aj wrote: > Hi Yang, > > I am attaching the logs for your reference, please help me what i am doing > wrong. > > Thanks, > Anuj > > On Wed, Apr 29, 2020 at 9:06 AM Yang Wang wrote: >

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-05-01 Thread aj
([^)]*)\\))?"+ > "(s/([^/]*)/([^/]*)/(g)?)?))/?(L)?"); > > > Could you share the jobmanager logs so that I could check the classpath > and Hadoop version? > > Best, > Yang > > On Tue, 28 Apr 2020 at 1:01 AM, aj wrote: > >> Hello Yang, >>

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-04-27 Thread aj
to download the > flink-shaded-hadoop with > version 2.8 here[1]. > > > [1]. https://flink.apache.org/downloads.html#additional-components > > Best, > Yang > > On Sat, 11 Apr 2020 at 4:21 AM, aj wrote: > >> Hi Robert, >> attached the full application log file. >&

Re: upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-04-10 Thread aj
Hi Robert, attached the full application log file. Thanks, Anuj Container: container_1585301409143_0044_01_01 on ip-172-25-2-209.ap-south-1.compute.internal_8041 =

Re: flink 1.9 conflict jackson version

2020-04-06 Thread aj
Hi Fanbin, I am facing a similar kind of issue. If you are able to resolve this issue, please let me know and help me as well: https://stackoverflow.com/questions/61012350/flink-reading-a-s3-file-causing-jackson-dependency-issue On Tue, Dec 17, 2019 at 7:50 AM ouywl wrote: > Hi Bu >I think I

upgrade flink from 1.9.1 to 1.10.0 on EMR

2020-04-06 Thread aj
Hello All, I am running Flink on AWS EMR; currently the latest version available on EMR is 1.9.1, but I want to upgrade to 1.10.0. I tried to manually replace the lib jars by downloading the 1.10.0 version, but this is not working. I am getting the following exception when trying to submit a job on y

Flink jackson conflict issue with aws-sdk dependency

2020-04-04 Thread aj
Hello, Please help me resolve this issue https://stackoverflow.com/questions/61012350/flink-reading-a-s3-file-causing-jackson-dependency-issue -- Thanks & Regards, Anuj Jain

Re: Help me understand this Exception

2020-03-18 Thread aj
That would provide more insight to exactly why your job is failing. > > Cheers, > Gordon > > On Tue, Mar 17, 2020 at 11:27 PM aj wrote: > Hi, > I am running a streaming job with generating watermark like this : > > public static class SessionAssi

Help me understand this Exception

2020-03-17 Thread aj
Hi, I am running a streaming job that generates watermarks like this:

    public static class SessionAssigner implements AssignerWithPunctuatedWatermarks {
        @Override
        public long extractTimestamp(GenericRecord record, long previousElementTimestamp) {
            long timestamp = (long) record.get(
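
A completed sketch of the truncated assigner above, assuming the record carries a long "event_ts" field; the 5-second trailing delay is an assumption:

    import javax.annotation.Nullable;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.streaming.api.functions.AssignerWithPunctuatedWatermarks;
    import org.apache.flink.streaming.api.watermark.Watermark;

    public static class SessionAssigner implements AssignerWithPunctuatedWatermarks<GenericRecord> {
        @Override
        public long extractTimestamp(GenericRecord record, long previousElementTimestamp) {
            return (long) record.get("event_ts");
        }

        @Nullable
        @Override
        public Watermark checkAndGetNextWatermark(GenericRecord lastElement, long extractedTimestamp) {
            // emit a watermark trailing each element by 5 seconds
            return new Watermark(extractedTimestamp - 5000);
        }
    }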

Re: Flink Session Window to enrich Event with unique id

2020-03-07 Thread aj
Please help me implement the above logic. On Mon, Mar 2, 2020 at 4:47 PM aj wrote: > Hi, > Is using the session window to implement the above logic a good idea, or > should I use a process function? > > On Sun, Mar 1, 2020 at 11:39 AM aj wrote: > >> Hi , >> >&

Re: Flink EventTime Processing Watermark is always coming as 9223372036854725808

2020-03-03 Thread aj
hat we are not wasting support resources in > our community on double-debugging issues) > > On Mon, Mar 2, 2020 at 5:36 PM aj wrote: > >> Hi David, >> >> Currently, I am testing it with a single source and parallelism 1 only, so >> I am not able to understand this be

Re: Flink EventTime Processing Watermark is always coming as 9223372036854725808

2020-03-02 Thread aj
_time.html#watermarks-in-parallel-streams > > Hope that helps > > Best, > > Dawid > On 02/03/2020 16:26, aj wrote: > > I am trying to use process function to some processing on a set of events. > I am using event time and keystream. The issue I am facing is The waterm

Flink EventTime Processing Watermark is always coming as 9223372036854725808

2020-03-02 Thread aj
I am trying to use a process function to do some processing on a set of events. I am using event time and a keyed stream. The issue I am facing is that the watermark value is always coming as 9223372036854725808. I have put print statements to debug and it shows like this: timestamp--1583128014000 extractedTi

Re: Flink Session Window to enrich Event with unique id

2020-03-02 Thread aj
Hi, Is using the session window to implement the above logic a good idea, or should I use a process function? On Sun, Mar 1, 2020 at 11:39 AM aj wrote: > Hi , > > I am working on a use case where i have a stream of events. I want to > attach a unique id to all the events happened

Flink Session Window to enrich Event with unique id

2020-02-29 Thread aj
Hi, I am working on a use case where I have a stream of events. I want to attach a unique id to all the events that happened in a session. Below is the logic that I am trying to implement: 1. session_started 2. whenever an event_name=search arrives, generate a unique search_id and attach this id to all the foll
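
A minimal sketch of the enrichment with a keyed process function instead of a session window: keep the current search_id in keyed state and stamp it onto every following event. The Event type and its fields are assumptions:

    import java.util.UUID;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    public class SearchIdEnricher extends KeyedProcessFunction<String, Event, Event> {
        private transient ValueState<String> searchId;

        @Override
        public void open(Configuration parameters) {
            searchId = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("search_id", String.class));
        }

        @Override
        public void processElement(Event e, Context ctx, Collector<Event> out) throws Exception {
            if ("search".equals(e.name)) {
                searchId.update(UUID.randomUUID().toString()); // new id per search
            }
            e.searchId = searchId.value(); // null until the first search event
            out.collect(e);
        }
    }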

Re: Flink Stream Sink ParquetWriter Failing with ClassCastException

2020-02-29 Thread aj
. Consequently, I would suggest to verify that your input data has > the right format and if not to filter those records out which are > non-conformant. > > Cheers, > Till > > On Sat, Feb 29, 2020 at 2:13 PM aj wrote: > >> Hi All, >> >> i have Written a con

Flink Stream Sink ParquetWriter Failing with ClassCastException

2020-02-29 Thread aj
Hi All, I have written a consumer that reads from a Kafka topic and writes the data in parquet format using StreamSink. But I am getting the following error. It runs for some hours, then starts failing with these exceptions. I tried to restart it, but it fails with the same exceptions. After I restart with the latest
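
Following Till's suggestion in the reply above (verify the input format and filter out non-conformant records), a minimal sketch that drops records whose schema does not match the one the Parquet writer was built with, instead of letting the writer throw a ClassCastException; "expectedSchema" is an assumption:

    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.streaming.api.datastream.DataStream;

    // Sketch: only records matching the writer's schema reach the sink.
    DataStream<GenericRecord> clean = stream.filter(
            record -> record.getSchema().equals(expectedSchema));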

Re: Map Of DataStream getting NullPointer Exception

2020-02-27 Thread aj
nreferencing the value you got from it. > > Also, the approach looks a bit strange. > Can you describe what you are trying to achieve? > > Regards, > Roman > > > On Mon, Feb 24, 2020 at 5:47 PM aj wrote: > >> >> I am trying below piece of code to

Map Of DataStream getting NullPointer Exception

2020-02-24 Thread aj
I am trying the below piece of code to create multiple DataStream objects and store them in a map:

    for (EventConfig eventConfig : eventTypesList) {
        LOGGER.info("creating a stream for {}", eventConfig.getEvent_name());
        String key = eventConfig.getEvent_name();
        final Streaming

Re: BucketingSink capabilities for DataSet API

2020-02-19 Thread aj
atibility. > > > > You could implement this with reading from input files as stream and > > then using StreamingFileSink with a custom BucketAssigner [1]. > > The problem with that (which was not yet resolved AFAIK) is described > > here [2] in "Important

Re: BucketingSink capabilities for DataSet API

2020-02-19 Thread aj
entually, for this use-case I chose Spark to do the > job... > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/connectors/streamfile_sink.html > [2] > https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/connectors/streamfile_sink.html#general &g

Re: BucketingSink capabilities for DataSet API

2020-02-15 Thread aj
Hi Rafi, I have a similar use case where I want to read Parquet files into a DataSet, perform some transformations, and similarly write the result partitioned by year/month/day. I am stuck at the first step only: how to read and write Parquet files using hadoop-compatibility.

Re: Flink ParquetAvroWriters Sink

2020-01-28 Thread aj
I was able to resolve this issue by setting classloader.resolve-order to parent-first. On Wed, Jan 22, 2020, 23:13 aj wrote: > Hi Arvid, > > I have implemented the code with envelope schema as you suggested but now > I am facing issues with the consumer . I have written co

Re: Flink ParquetAvroWriters Sink

2020-01-23 Thread aj
> [1], which would write GenericRecords to byte[] and vice versa. > > Note that I still recommend to just bundle the schema with your Flink > application and not reinvent the wheel. > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/custom_serializers.

Re: Flink ParquetAvroWriters Sink

2020-01-22 Thread aj
2/gettingstartedjava.html > [3] https://github.com/davidmc24/gradle-avro-plugin > > On Wed, Jan 22, 2020 at 6:43 PM aj wrote: > >> Hi Arvid, >> >> I have implemented the code with envelope schema as you suggested but now >> I am facing issues with the con

Re: Flink ParquetAvroWriters Sink

2020-01-22 Thread aj
ic" Please help me to resolve this issue. Thanks, Anuj On Mon, Jan 20, 2020 at 9:42 PM aj wrote: > Thanks, Arvid for all the clarification. I will work on the approach you > suggested. > > Thanks, > Anuj > > On Sat, Jan 18, 2020 at 10:17 PM Arvid Heise wrote: >

Re: Flink ParquetAvroWriters Sink

2020-01-20 Thread aj
e approach that I > outlined previously by bundling a specific schema with your application. > > If you want to extract the latest schema for a subject: > > var registryClient = new CachedSchemaRegistryClient(, 1000); > var versions = registryClient.getAllVersions(); > var schema =
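
The truncated snippet above can be completed roughly as follows, assuming the Confluent schema-registry client library; the registry URL and subject name are placeholders:

    import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
    import io.confluent.kafka.schemaregistry.client.SchemaMetadata;
    import org.apache.avro.Schema;

    // Sketch: fetch the latest schema registered for a subject.
    CachedSchemaRegistryClient registryClient =
            new CachedSchemaRegistryClient("http://registry:8081", 1000);
    SchemaMetadata latest = registryClient.getLatestSchemaMetadata("my-topic-value");
    Schema schema = new Schema.Parser().parse(latest.getSchema());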

Re: Flink ParquetAvroWriters Sink

2020-01-18 Thread aj
ecute everything. > > You have one project where all validations reside. But you'd have almost > no overhead to process a given source of eventType. The downside of that > approach is of course, that each new event type would require a > redeployment, but that seems like what you'

Re: Flink ParquetAvroWriters Sink

2020-01-16 Thread aj
": null > } > ] > } > > This envelope is evolvable (arbitrary addition/removal of wrapped types, > which by themselves can be evolved), and adds only a little overhead (1 > byte per subtype). The downside is that you cannot enforce that exactly one > of the subtypes is

Flink ParquetAvroWriters Sink

2020-01-16 Thread aj
Hi All, I have a use case where I am getting a different set of Avro records in Kafka. I am using the schema registry to store Avro schemas. One topic can also have different types of records. Now I have created a GenericRecord stream using KafkaAvroDeserializer by defining a custom Deserializer clas
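
Flink also ships a registry-aware Avro format that can stand in for a hand-written deserializer. A minimal sketch, where the topic name, registry URL, schemaString and kafkaProps are assumptions:

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    // Sketch: build a GenericRecord stream with Flink's built-in
    // Confluent-registry deserialization schema.
    Schema readerSchema = new Schema.Parser().parse(schemaString);
    FlinkKafkaConsumer<GenericRecord> consumer = new FlinkKafkaConsumer<>(
            "events",
            ConfluentRegistryAvroDeserializationSchema.forGeneric(
                    readerSchema, "http://registry:8081"),
            kafkaProps);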

Re: Flink Dataset to ParquetOutputFormat

2020-01-16 Thread aj
>> Best, >> Vino >> >> [1]: >> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/filesystem_sink.html#bucketing-file-sink >> >> On Fri, 27 Dec 2019 at 1:51 AM, aj wrote: >> >>> Thanks Vino. >>> >>> I am able to write data

Re: Kafka Schema registry

2020-01-14 Thread aj
ConfluentRegistryAvroDeserializationSchema.forGeneric() requires a reader schema. How can we use it to deserialize using the writer schema? On Fri, Sep 13, 2019 at 12:04 AM Lasse Nedergaard wrote: > Hi Elias > > Thanks for letting me know. I have found it but we also need the option to > register Avro

Re: Flink Dataset to ParquetOutputFormat

2019-12-26 Thread aj
ave a look. I cannot be sure whether it is helpful or not. > > [1]: https://github.com/FelixNeutatz/parquet-flinktacular > > Best, > Vino > > On Sat, 21 Dec 2019 at 7:03 PM, aj wrote: > >> Hello All, >> >> I am getting a set of events in JSON that I am dumping in the

Flink Dataset to ParquetOutputFormat

2019-12-21 Thread aj
Hello All, I am getting a set of events in JSON that I am dumping into an hourly bucket in S3. I am reading this hourly bucket and creating a DataSet. I want to write this DataSet as Parquet, but I am not able to figure out how. Can somebody help me with this? Thanks, Anuj
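
One possible route, sketched under the assumption of the flink-hadoop-compatibility and parquet-avro dependencies (and a parquet-avro version where AvroParquetOutputFormat is generic); the output path and schema are placeholders:

    import org.apache.avro.generic.GenericRecord;
    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.hadoop.mapreduce.HadoopOutputFormat;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.parquet.avro.AvroParquetOutputFormat;

    // Sketch: wrap parquet-avro's AvroParquetOutputFormat in Flink's
    // HadoopOutputFormat. HadoopOutputFormat emits Tuple2<K, V>, so the
    // records are keyed with a null Void first.
    Job job = Job.getInstance();
    AvroParquetOutputFormat.setSchema(job, schema);
    FileOutputFormat.setOutputPath(job, new Path("s3://bucket/output"));

    HadoopOutputFormat<Void, GenericRecord> parquetFormat =
            new HadoopOutputFormat<>(new AvroParquetOutputFormat<GenericRecord>(), job);

    dataSet.map(new MapFunction<GenericRecord, Tuple2<Void, GenericRecord>>() {
                @Override
                public Tuple2<Void, GenericRecord> map(GenericRecord record) {
                    return Tuple2.of(null, record);
                }
            })
           .output(parquetFormat);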

Re: watermark trigger doesn't check whether element's timestamp is passed

2016-11-01 Thread aj heller
I'm also not sure if discussions are ongoing. I had hoped to prototype the proposal as is, to have something more concrete to discuss. Cheers, aj On Nov 1, 2016 3:24 PM, "Manu Zhang" wrote: > Thanks. The ideal case is to fire after watermark past each element from > the window bu

Re: Merging N parallel/partitioned WindowedStreams together, one-to-one, into a global window stream

2016-10-06 Thread AJ Heller
bal window for each window id and collect all > N windows. > > Best, Fabian > > 2016-10-06 22:39 GMT+02:00 AJ Heller : > >> The goal is: >> * to split data, random-uniformly, across N nodes, >> * window the data identically on each node, >> * transf

Merging N parallel/partitioned WindowedStreams together, one-to-one, into a global window stream

2016-10-06 Thread AJ Heller
ly need to aggregate over one window from each partition at a time. Similarly for a fold. The closest I have found is ParallelMerge for ConnectedStreams, but I have not found a way to apply it to this problem. Can flink achieve this? If so, I'd greatly appreciate a point in the right direction. Cheers, -aj

Re: ResourceManager not using correct akka URI in standalone cluster (?)

2016-09-28 Thread AJ Heller
he scenario, on the off chance there's a bug in there I can help dig up. Best, aj

Re: ResourceManager not using correct akka URI in standalone cluster (?)

2016-09-15 Thread AJ Heller
work because I couldn't get the private IP scenario working. Running `./bin/start-local.sh` shows non-zero counts in the Flink Dashboard. Cluster setups show zero-counts all around. -aj On Thu, Sep 15, 2016 at 12:41 PM, AJ Heller wrote: > I'm running a standalone cluster on Amazon

ResourceManager not using correct akka URI in standalone cluster (?)

2016-09-15 Thread AJ Heller
I'm running a standalone cluster on Amazon EC2. Leader election is happening according to the logs, and the Flink Dashboard is up and running, accessible remotely. The issue I'm having is that the SocketWordCount example is not working; the local connection is being refused! In the Flink Dashboard