Re: Processing events based on weights

2019-12-17 Thread Vijay Srinivasaraghavan
till the batch size is reached) Appreciate your inputs. Thanks, Vijay. On Monday, December 16, 2019, 08:20:31 PM PST, Vijay Srinivasaraghavan wrote: Hello, I would like to understand options available to design an ingestion pipeline to support the following requirements. 1) Events are coming

Processing events based on weights

2019-12-16 Thread Vijay Srinivasaraghavan
Hello, I would like to understand the options available to design an ingestion pipeline that supports the following requirements. 1) Events are coming from various sources and, depending on the type of the event, are stored in specific Kafka topics (say we have 4 topics) 2) The events that are par

stream processing and bulk updates to multiple sinks

2019-11-09 Thread Vijay Srinivasaraghavan
Hello, I need to process, in near real-time, events that are generated from various upstream sources and are currently stored in Kafka. I want to build a pipeline that reads the data as a continuous stream, enriches the events and finally stores them in both ClickHouse and Kafka sinks. To get a
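As a rough sketch of the pipeline shape described above (not the poster's actual job), a DataStream program could read the Kafka topic, enrich each event, and fan the result out to both sinks. Topic names, the broker address and the ClickHouse sink are placeholder assumptions; ClickHouse has no official Flink connector, so a custom or JDBC-based SinkFunction would be needed in its place.

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class EnrichmentPipelineSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000); // checkpoints back the sinks' delivery guarantees

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder
        props.setProperty("group.id", "enrichment-pipeline");

        DataStream<String> raw = env.addSource(
                new FlinkKafkaConsumer<>("events-in", new SimpleStringSchema(), props));

        // Enrichment step; a real job might use keyed state or Async I/O against a lookup store.
        DataStream<String> enriched = raw.map(event -> event + ",enriched");

        // Sink 1: enriched events back to Kafka.
        enriched.addSink(new FlinkKafkaProducer<>("events-out", new SimpleStringSchema(), props));

        // Sink 2: ClickHouse via a placeholder sink; the actual write is out of scope here.
        enriched.addSink(new ClickHouseSinkStub("jdbc:clickhouse://clickhouse:8123/analytics"));

        env.execute("enrichment-pipeline-sketch");
    }

    /** Placeholder: a real implementation would batch rows and write them to ClickHouse over JDBC/HTTP. */
    static class ClickHouseSinkStub extends RichSinkFunction<String> {
        private final String jdbcUrl;
        ClickHouseSinkStub(String jdbcUrl) { this.jdbcUrl = jdbcUrl; }
        @Override
        public void invoke(String value, Context context) {
            // write to ClickHouse (omitted in this sketch)
        }
    }
}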

Re: Mongo Connector

2019-10-16 Thread Vijay Srinivasaraghavan
Hi Jingsong Lee, Thanks for the details. Were you able to achieve end-to-end exactly-once support with Mongo? Also, for doing intermittent reads from Mongo (Kafka -> process event -> lookup Mongo -> enhance event -> sink to Mongo), I am thinking of using Async IO (https://ci.apache.org/pr
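For reference, a minimal sketch of the Async I/O pattern referred to above (Kafka -> process event -> lookup Mongo -> enhance event -> sink). The Mongo query itself is stubbed out with a CompletableFuture and the timeout/capacity values are only illustrative; a real job would issue the lookup with the async MongoDB driver inside asyncInvoke().

import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class MongoAsyncEnrichmentSketch {

    /** Looks up enrichment data for each event; the Mongo client setup is omitted. */
    public static class MongoLookupFunction extends RichAsyncFunction<String, String> {
        @Override
        public void open(Configuration parameters) {
            // initialize the async MongoDB client here (omitted)
        }

        @Override
        public void asyncInvoke(String event, ResultFuture<String> resultFuture) {
            CompletableFuture
                    .supplyAsync(() -> event + ",looked-up") // stand-in for the Mongo query
                    .thenAccept(enriched -> resultFuture.complete(Collections.singleton(enriched)));
        }
    }

    /** Wires the async function into a stream with a timeout and a bounded number of in-flight requests. */
    public static DataStream<String> enrich(DataStream<String> events) {
        return AsyncDataStream.unorderedWait(
                events, new MongoLookupFunction(), 4, TimeUnit.SECONDS, 100);
    }
}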

Mongo Connector

2019-10-12 Thread Vijay Srinivasaraghavan
Hello, Do we know how much support we have for Mongo? The documentation page points to a connector repo that is very old (last updated 5 years ago) and it looks like that was just sample code to showcase the integration. https://ci.apache.org/projects/flink/flink-docs-stable/dev/batch/con

Re: Flink 1.4.0-RC2 Hadoop Build

2017-12-04 Thread Vijay Srinivasaraghavan
d exclusions that were too aggressive. Best, Aljoscha On 3. Dec 2017, at 21:58, Vijay Srinivasaraghavan wrote: The issue is reproducible with 2.7.1 as well. My understanding is that from 1.4.0 we don't include Hadoop dependencies by default but only when we

Re: Flink 1.4.0-RC2 Hadoop Build

2017-12-03 Thread Vijay Srinivasaraghavan
ake this build error go away. In any case, the fact that Hadoop seriously needs the Servlet dependency to initialize HDFS is just ridiculous. Hadoop seriously has the worst dependency management I have ever seen. On Fri, Dec 1, 2017 at 1:41 PM, Vijay Sr

Flink 1.4.0-RC2 Hadoop Build

2017-12-01 Thread Vijay Srinivasaraghavan
Hello, I am trying to build and run Flink from the 1.4.0-rc2 branch with Hadoop binary 2.7.0 compatibility. Here are the steps I followed to build (I have Maven 3.3.9). === cd $FLINK_HOME; mvn clean install -DskipTests -Dhadoop.vers

Re: Support for multiple HDFS

2017-08-24 Thread Vijay Srinivasaraghavan
.com> wrote: Hi, I don’t think that this is currently supported. If you see a use case for this (over creating different root directories for checkpoint data and result data) then I suggest that you open a JIRA issue with a new feature

Re: Support for multiple HDFS

2017-08-23 Thread Vijay Srinivasaraghavan
configured with one of these HCFS as the state backend store. Hope this helps. Regards, Vijay. On Wednesday, August 23, 2017 11:06 AM, Ted Yu wrote: Would HDFS-6584 help with your use case? On Wed, Aug 23, 2017 at 11:00 AM, Vijay Srinivasaraghavan <vijikar...@yahoo.com.invalid>

Support for multiple HDFS

2017-08-23 Thread Vijay Srinivasaraghavan
Hello, Is it possible for a Flink cluster to use multiple HDFS repositories (HDFS-1 for managing the Flink state backend, HDFS-2 for syncing results from the user job)? The scenario can be viewed in the context of running some jobs that are meant to push their results to an archive repository (cold storage)
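One option worth exploring, sketched below under the assumption that both namenodes are reachable from the Flink JVMs and that the Hadoop client configuration allows access to both clusters, is to use fully qualified hdfs:// URIs: point the state backend at HDFS-1 and the job's output sink at HDFS-2. Hostnames and paths are placeholders, and the thread above does not confirm this setup.

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;

public class TwoHdfsClustersSketch {

    public static void wire(StreamExecutionEnvironment env, DataStream<String> results) throws Exception {
        // HDFS-1: checkpoints / state backend.
        env.setStateBackend(new FsStateBackend("hdfs://namenode-state:8020/flink/checkpoints"));
        env.enableCheckpointing(60_000);

        // HDFS-2: job results pushed to the archive (cold storage) cluster.
        results.addSink(new BucketingSink<String>("hdfs://namenode-archive:8020/archive/results"));
    }
}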

State Backend

2017-08-03 Thread Vijay Srinivasaraghavan
Hello, I would like to know if we have any latency requirements for choosing an appropriate state backend. For example, if an HCFS implementation is used as the Flink state backend (instead of stock HDFS), are there any performance implications that one needs to know about? - Frequency o

ElasticsearchSink Serialization Error

2017-04-27 Thread Vijay Srinivasaraghavan
Hello, I am seeing the below error when I try to use ElasticsearchSink. It complains about serialization and appears to point to the "IndexRequestBuilder" implementation. I have tried the suggestion mentioned in http://stackoverflow.com/questions/33246864/elasticsearch-sink-seralizability (c
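A commonly suggested remedy for this class of error is to make the sink function a standalone (or static nested) serializable class that builds the IndexRequest inside the callback instead of capturing non-serializable objects from the enclosing scope. A sketch, assuming the ElasticsearchSinkFunction API that superseded IndexRequestBuilder in later 1.x connectors, with placeholder index/type names:

import java.util.HashMap;
import java.util.Map;
import org.apache.flink.api.common.functions.RuntimeContext;
import org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkFunction;
import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;

public class EventIndexer implements ElasticsearchSinkFunction<String> {
    @Override
    public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
        // Build the request here; holding only serializable fields keeps the sink serializable.
        Map<String, String> json = new HashMap<>();
        json.put("data", element);

        IndexRequest request = Requests.indexRequest()
                .index("events")   // placeholder index name
                .type("event")     // placeholder type (pre-ES7 connectors)
                .source(json);

        indexer.add(request);
    }
}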

Re: About the ACL of Zookeeper

2017-03-02 Thread Vijay Srinivasaraghavan
I assume ZK is configured to use Kerberos. I would check a few configurations on the ZK side to make sure that the Kerberos configurations are working fine. On the ZK server side configuration: 1) authProvider is configured to use the SASL authentication provider 2) ZK server is using the appropriate JAAS en
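For illustration, the two server-side pieces mentioned above typically look like the following; the keytab path, principal and realm are placeholders.

# zoo.cfg
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider

# JAAS file passed to the ZK server JVM via -Djava.security.auth.login.config
Server {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/etc/security/keytabs/zookeeper.service.keytab"
  storeKey=true
  principal="zookeeper/zk-host.example.com@EXAMPLE.COM";
};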

Re: Reliable Distributed FS support (HCFS)

2017-02-16 Thread Vijay Srinivasaraghavan
FS) in terms of the scenarios/use cases that need to be tested? Is there any general guidance on this? --- Regards, Vijay. On Wednesday, February 15, 2017 11:28 AM, Vijay Srinivasaraghavan wrote: Hello, Regarding the Filesystem abstraction support, we are planning to use a distributed file s

Reliable Distributed FS support (HCFS)

2017-02-15 Thread Vijay Srinivasaraghavan
Hello, Regarding the Filesystem abstraction support, we are planning to use a distributed file system that complies with the Hadoop Compatible File System (HCFS) standard in place of standard HDFS. According to the documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.3/internals

[jira] [Created] (FLINK-5593) Modify current dcos-flink implementation to use runit service

2017-01-20 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-5593: -- Summary: Modify current dcos-flink implementation to use runit service Key: FLINK-5593 URL: https://issues.apache.org/jira/browse/FLINK-5593

[jira] [Created] (FLINK-4950) Add support to include multiple Yarn application entries in Yarn properties file

2016-10-27 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4950: -- Summary: Add support to include multiple Yarn application entries in Yarn properties file Key: FLINK-4950 URL: https://issues.apache.org/jira/browse/FLINK

[jira] [Created] (FLINK-4919) Add secure cookie support for the cluster deployed in Mesos environment

2016-10-25 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4919: -- Summary: Add secure cookie support for the cluster deployed in Mesos environment Key: FLINK-4919 URL: https://issues.apache.org/jira/browse/FLINK-4919

[jira] [Created] (FLINK-4918) Add SSL support to Mesos artifact server

2016-10-25 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4918: -- Summary: Add SSL support to Mesos artifact server Key: FLINK-4918 URL: https://issues.apache.org/jira/browse/FLINK-4918 Project: Flink

[jira] [Created] (FLINK-4826) Add keytab based kerberos support to run Flink in Mesos environment

2016-10-13 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4826: -- Summary: Add keytab based kerberos support to run Flink in Mesos environment Key: FLINK-4826 URL: https://issues.apache.org/jira/browse/FLINK-4826

Re: S3/S3A support

2016-10-11 Thread Vijay Srinivasaraghavan
results in occasional (seldom) periods of heavy restart retries, until all files are visible to all participants. If you run into that issue, it may be worthwhile to look at Flink 1.2-SNAPSHOT. Best, Stephan On Tue, Oct 11, 2016 at 12:13 AM, Vijay Srinivasaraghavan wrote: Hello, Per documentat

S3/S3A support

2016-10-10 Thread Vijay Srinivasaraghavan
Hello, Per documentation (https://ci.apache.org/projects/flink/flink-docs-master/setup/aws.html), it looks like S3/S3A FS implementation is supported using standard Hadoop S3 FS client APIs. In the absence of using standard HCFS and going with S3/S3A, 1) Are there any known limitations/issues? 

[jira] [Created] (FLINK-4667) Yarn Session CLI not listening on correct ZK namespace when HA is enabled to use ZooKeeper backend

2016-09-22 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4667: -- Summary: Yarn Session CLI not listening on correct ZK namespace when HA is enabled to use ZooKeeper backend Key: FLINK-4667 URL: https://issues.apache.org

[jira] [Created] (FLINK-4637) Address Yarn proxy incompatibility with Flink Web UI when service level authorization is enabled

2016-09-19 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4637: -- Summary: Address Yarn proxy incompatibility with Flink Web UI when service level authorization is enabled Key: FLINK-4637 URL: https://issues.apache.org/jira

[jira] [Created] (FLINK-4635) Implement Data Transfer Authentication using shared secret configuration

2016-09-19 Thread Vijay Srinivasaraghavan (JIRA)
Vijay Srinivasaraghavan created FLINK-4635: -- Summary: Implement Data Transfer Authentication using shared secret configuration Key: FLINK-4635 URL: https://issues.apache.org/jira/browse/FLINK-4635

Re: Yarn Test Timeout

2016-09-05 Thread Vijay Srinivasaraghavan
ne NodeManager can host multiple containers. The test only seems to reliably fail when it's executed exclusively, where a dedicated Yarn mini cluster is started only for the test. The problem is fixed on the latest master. Thanks, Max On Sun, Sep 4, 2016 at 11:21 PM, Vijay Srinivasaragh

Re: Yarn Test Timeout

2016-09-04 Thread Vijay Srinivasaraghavan
Hi Robert, Yes, I am consistently seeing the issue. Here is the stack trace for the run. mvn test integration-test -Dtest="YARNSessionCapacitySchedulerITCase#testTaskManagerFailure" -Pinclude-yarn-tests -pl flink-yarn-tests -DfailIfNoTests=false [INFO] Scanning for projects... [INFO]

Yarn Test Timeout

2016-09-02 Thread Vijay Srinivasaraghavan
Hello, I am seeing a "timeout" issue for one of the Yarn test cases (YarnSessionCapacitySchedulerITCase>YarnTestBase.checkClusterEmpty) and noticed similar references in FLINK-2213 (https://github.com/apache/flink/pull/1588). I have tested with the latest master code. Is anyone seeing this issue? RegardsV

Re: IT Test

2016-06-29 Thread Vijay Srinivasaraghavan
Yarn tests. If you want to fix that, you could open a pull request. Cheers, Max On Wed, Jun 29, 2016 at 1:34 AM, Vijay Srinivasaraghavan wrote: Hi Max, You are right. The uber jar is built in the "install" or "package" stage and the Yarn test assumes the Flink uber jar

Re: IT Test

2016-06-28 Thread Vijay Srinivasaraghavan
:343 Flink uberjar not found. Your test run fails because it can't find the Flink jar. Most likely, you have not run 'mvn clean install' before running the Yarn tests. You need to 'package' or 'install' at least flink-yarn in order to execute flink-yarn-tests. Chee

Re: IT Test

2016-06-27 Thread Vijay Srinivasaraghavan
xtended command from above? Cheers, Aljoscha On Mon, 27 Jun 2016 at 14:57 Vijay Srinivasaraghavan wrote: I am on Ubuntu 16.x, Java OpenJDK 1.8.0_91. Can you try the commands below and see if they work with the latest trunk code? mvn clean verify -pl flink-yarn-tests -Pinclude-yarn-tests -Dtest=

Re: IT Test

2016-06-27 Thread Vijay Srinivasaraghavan
, Aljoscha Krettek wrote: Hi, I just ran a "mvn clean verify" and it passed on my machine (latest master, OS X El Capitan, Java 1.8.0_40, Maven 3.3.9). What's your environment? Cheers, Aljoscha On Fri, 24 Jun 2016 at 16:47 Vijay Srinivasaraghavan wrote: I am seeing below failu

IT Test

2016-06-24 Thread Vijay Srinivasaraghavan
I am seeing below failure consistently with the latest trunk code when I run "mvn clean verify". Is anyone seeing similar error in your environment? Failed tests:   LocalFlinkMiniClusterITCase.testLocalFlinkMiniClusterWithMultipleTaskManagers:166 Thread Thread[initialSeedUniquifierGenerator,5,ma

Re: Shading Plugin + Hadoop Dependency

2016-06-23 Thread Vijay Srinivasaraghavan
inikdc/MiniKdc$1.class org/apache/hadoop/minikdc/MiniKdc.class META-INF/maven/org.apache.hadoop/hadoop-minikdc/ META-INF/maven/org.apache.hadoop/hadoop-minikdc/pom.xml META-INF/maven/org.apache.hadoop/hadoop-minikdc/pom.properties On Tue, Jun 21, 2016 at 5:47 PM, Vijay Srinivasaraghavan < vijika

Re: Shading Plugin + Hadoop Dependency

2016-06-21 Thread Vijay Srinivasaraghavan
t)-[master] % Is the same true on your machine? On Tue, Jun 21, 2016 at 5:36 PM, Vijay Srinivasaraghavan wrote: Hi Rob, You need to include the lines below in the pom.xml to resolve the dependency chain error: org.apache.felix maven-bundle-plugin true true. Regards, Vijay. On Tuesda

Re: Shading Plugin + Hadoop Dependency

2016-06-21 Thread Vijay Srinivasaraghavan
local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced -> [Help 1] [ERROR] How did you do it? Regards, Robert On Tue, Jun 21, 2016 at 5:06 PM, Vijay Srinivasaraghavan wrote: Hi Rob, Yes, I checked but the resulting jar does not

Re: Shading Plugin + Hadoop Dependency

2016-06-21 Thread Vijay Srinivasaraghavan
ntains the required classes? Regards, Robert On Mon, Jun 20, 2016 at 6:18 PM, Vijay Srinivasaraghavan wrote: Hello, I was trying to include the "hadoop-minikdc" component in the Yarn test framework by adding the dependency to the "flink-shaded-include-yarn-tests" pom.xml file. org.ap

Shading Plugin + Hadoop Dependency

2016-06-20 Thread Vijay Srinivasaraghavan
Hello, I was trying to include the "hadoop-minikdc" component in the Yarn test framework by adding the dependency to the "flink-shaded-include-yarn-tests" pom.xml file: <dependency> <groupId>org.apache.hadoop</groupId> <artifactId>hadoop-minikdc</artifactId> <version>${hadoop.version}</version> </dependency> The dependency inclusion seems to be working from the IDE. IntelliJ picked up the

Yarn Integration Test

2016-06-14 Thread Vijay Srinivasaraghavan
Hello, I am trying to run the "Yarn integration test" but could not find the right way to execute it. I ran "mvn -pl flink-yarn-tests integration-test" and got an error message "Could not find the selected project in the reactor: flink-yarn-tests". Running "mvn -pl flink-yarn integration-test" did not fail

Develop/Test within IDE

2016-05-21 Thread Vijay Srinivasaraghavan
Folks, I am making some code changes to the "flink-runtime" module. I am able to test/debug my changes from the IDE (IntelliJ) using the remote debug option (to test JobManager/TaskManager runtime/startup). Steps followed... 1) git clone flink... 2) Imported the Flink Maven project into IntelliJ 3) Made some code ch

Re: remote debugging

2016-05-16 Thread Vijay Srinivasaraghavan
h the script you tampered with and run the debugger. Please note that if you set "suspend=y" Flink won't start until the debugger is attached to the process. Also beware that if the machine running Flink is far away from the remote debugger you may suffer from increased latency whe

remote debugging

2016-05-16 Thread Vijay Srinivasaraghavan
How do I attach a remote debugger to a running Flink cluster from IntelliJ? I would appreciate it if anyone could share the steps. Regards, Vijay
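For anyone searching this thread later, one common setup (a sketch assuming the standard JDWP agent and Flink's env.java.opts setting; the port is arbitrary) is to open a debug port on the Flink JVMs and attach an IntelliJ "Remote" run configuration to it:

# flink-conf.yaml
env.java.opts: -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005

# IntelliJ: Run > Edit Configurations > + > Remote, host = JobManager/TaskManager host, port = 5005

With suspend=y the JVM waits until the debugger attaches, as the reply above notes.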

Re: RollingSink

2016-03-23 Thread Vijay Srinivasaraghavan
On Mar 23, 2016, at 5:39 AM, Aljoscha Krettek wrote: Ok, then you should be able to change the log level to DEBUG in conf/log4j.properties.

Re: RollingSink

2016-03-22 Thread Vijay Srinivasaraghavan
ilable write a .valid-length file) 3. Move pending files that were part of the checkpoint to their final location 4. Clean up any leftover pending/in-progress files Cheers, Aljoscha On 22 Mar 2016, at 10:08, Vijay Srinivasaraghavan wrote: Hello, I have enabled checkpointing and I am

RollingSink

2016-03-22 Thread Vijay Srinivasaraghavan
Hello, I have enabled checkpointing and I am using RollingSink to sink the data to HDFS (2.7.x) from KafkaConsumer. To simulate failover/recovery, I stopped a TaskManager and the job got rescheduled to another TaskManager instance. At that moment, the current "in-progress" file gets closed and renamed
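For reference, a minimal sketch of the setup being described: checkpointing enabled and a RollingSink writing the Kafka data to HDFS. The path, checkpoint interval, bucket format and batch size are placeholder values; on recovery the sink truncates the in-progress file (or writes a .valid-length file) and moves pending files, as outlined in the reply above.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.DateTimeBucketer;
import org.apache.flink.streaming.connectors.fs.RollingSink;

public class RollingSinkSketch {

    public static void attach(StreamExecutionEnvironment env, DataStream<String> fromKafka) {
        env.enableCheckpointing(10_000); // checkpoints drive the pending -> final file renames

        RollingSink<String> sink = new RollingSink<String>("hdfs://namenode:8020/flink/out");
        sink.setBucketer(new DateTimeBucketer("yyyy-MM-dd--HH")); // one bucket per hour
        sink.setBatchSize(1024L * 1024 * 128);                    // roll part files at ~128 MB

        fromKafka.addSink(sink);
    }
}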

YARN/Flink Job

2016-03-19 Thread Vijay Srinivasaraghavan
If I start a Flink job on YARN with the below option, does the Flink (JM & TM) service get killed after the job execution is complete? In other words, what is the lifetime of the Flink service after the job is complete? "Run a single Flink job on YARN" The documentation above describes how to start a Fl