Re: Two Queries and a Kafka Topic

2020-08-10 Thread Austin Cawley-Edwards
Hey all, A bit late here and I’m not sure it’s totally valuable, but we have a similar job where we need to query an external data source on startup before processing the main stream as well. Instead of making that request in the Jobmanager process when building the graph, we make those requests

Re: Two Queries and a Kafka Topic

2020-08-10 Thread Austin Cawley-Edwards
for each operator that needs the *data* 🤦‍♂️ On Mon, Aug 10, 2020 at 3:58 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Hey all, > > A bit late here and I’m not sure it’s totally valuable, but we have a > similar job where we need to query an external data

Re: K8s operator for dlink

2020-08-14 Thread Austin Cawley-Edwards
Hey Narasimha, We use an operator at FinTech Studios[1] (built by me) to deploy Flink via the Ververica Platform[2]. We've been using it in production for the past 7 months with no "show-stopping" bugs, and know some others have been experimenting with bringing it to production as well. Best, Aus

Flink SQL Streaming Join Creates Duplicates

2020-08-27 Thread Austin Cawley-Edwards
Hey all, I've got a Flink 1.10 Streaming SQL job using the Blink Planner that is reading from a few CSV files and joins some records across them into a couple of data streams (yes, this could be a batch job; won't get into why we chose streams unless it's relevant). These joins are producing some d

Re: Flink SQL Streaming Join Creates Duplicates

2020-08-27 Thread Austin Cawley-Edwards
data a 1", b = "data b 1", c = null) Record(a = "data a 2", b = "data b 2", c = "data c 2") Record(a = "data a 2", b = "data b 2", c = null) On Thu, Aug 27, 2020 at 3:34 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: &

Re: Flink SQL Streaming Join Creates Duplicates

2020-08-27 Thread Austin Cawley-Edwards
Ah, I think the "Result Updating" is what got me -- INNER joins do the job! On Thu, Aug 27, 2020 at 3:38 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > oops, the example query should actually be: > > SELECT table_1.a, table_1.b, table_2.c > FROM table_1
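
A minimal sketch of the regular INNER JOIN shape discussed in this thread, as a Java fragment (not a complete class). It assumes the Flink 1.10 Blink planner and two already-registered tables, table_1 and table_2, whose column names are illustrative only:

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(
        env, EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build());

    // table_1 and table_2 are assumed to be registered from the CSV sources mentioned above.
    Table joined = tableEnv.sqlQuery(
        "SELECT table_1.a, table_1.b, table_2.c "
            + "FROM table_1 "
            + "INNER JOIN table_2 ON table_1.a = table_2.a");

    // With append-only inputs, a regular INNER JOIN emits append-only output,
    // so no retraction ("result updating") rows show up as apparent duplicates.
    tableEnv.toAppendStream(joined, Row.class).print();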

Re: Flink SQL Streaming Join Creates Duplicates

2020-08-31 Thread Austin Cawley-Edwards
could > you please update your current progress? > > Best, > > Arvid > > On Thu, Aug 27, 2020 at 11:41 PM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Ah, I think the "Result Updating" is what got me -- INNER joins do the

Re: Apache Qpid connector.

2020-09-25 Thread Austin Cawley-Edwards
Hey (Master) Parag, I don't know anything about Apache Qpid, but from the homepage[1], it looks like the protocol is just AMQP? Are there more specifics than that? If it is just AMQP would the RabbitMQ connector[2] work for you? Best, Austin [1]: https://qpid.apache.org/ [2]: https://ci.apache.o

Streaming SQL Job Switches to FINISHED before all records processed

2020-09-28 Thread Austin Cawley-Edwards
Hey all, I'm not sure if I've missed something in the docs, but I'm having a bit of trouble with a streaming SQL job that starts w/ raw SQL queries and then transitions to a more traditional streaming job. I'm on Flink 1.10 using the Blink planner, running locally with no checkpointing. The job l

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-09-29 Thread Austin Cawley-Edwards
om window trigger)? This would help us to better understand your > problem. > > I am also pulling in Klou and Timo who might help with the windowing logic > and the Table to DataStream conversion. > > Cheers, > Till > > On Mon, Sep 28, 2020 at 11:49 PM Austin Cawley-Edwards

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-09-30 Thread Austin Cawley-Edwards
Hey all, Thanks for your patience. I've got a small repo that reproduces the issue here: https://github.com/austince/flink-1.10-sql-windowing-error Not sure what I'm doing wrong but it feels silly. Thanks so much! Austin On Tue, Sep 29, 2020 at 3:48 PM Austin Cawley-Edwards &l

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-10-01 Thread Austin Cawley-Edwards
s, > Till > > On Thu, Oct 1, 2020 at 12:01 AM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hey all, >> >> Thanks for your patience. I've got a small repo that reproduces the issue >> here: https://github.com/austince/flink-1.10-sql-w

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-10-01 Thread Austin Cawley-Edwards
conversion? I'm currently getting the `Record has Long.MIN_VALUE timestamp (= no timestamp marker)` error, though I'm seeing timestamps being assigned if I step through the AutomaticWatermarkContext. Thanks, Austin On Thu, Oct 1, 2020 at 10:52 AM Austin Cawley-Edwards < austin.caw...@gma

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-10-02 Thread Austin Cawley-Edwards
/streaming/time_attributes.html > > Cheers, > Till > > On Fri, Oct 2, 2020 at 12:10 AM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hey Till, >> >> Just a quick question on time characteristics -- this should work for >> Ingestio

Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

2020-10-06 Thread Austin Cawley-Edwards
Hey Dan, We use Junit5 and Bazel to run Flink SQL tests on a mini cluster and haven’t had issues, though we’re only testing on streaming jobs. Happy to help setting up logging with that if you’d like. Best, Austin On Tue, Oct 6, 2020 at 6:02 PM Dan Hill wrote: > I don't think any of the gotch

Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

2020-10-06 Thread Austin Cawley-Edwards
20 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Hey Dan, > > We use Junit5 and Bazel to run Flink SQL tests on a mini cluster and > haven’t had issues, though we’re only testing on streaming jobs. > > Happy to help setting up logging with that if you’d like.

Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

2020-10-06 Thread Austin Cawley-Edwards
82.html On Tue, Oct 6, 2020 at 6:32 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Unless it's related to this issue[1], which was w/ my JOIN and time > characteristics, though not sure that applies for batch. > > Best, > Austin > > [1]: > apache-fli

Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

2020-10-06 Thread Austin Cawley-Edwards
> ["-Dlog4j.configuration=file:///Users/danhill/code/src/ai/promoted/logprocessor/batch/log4j.properties"], > >> ) > >> > >> # log4j.properties > >> status = error > >> name = Log4j2PropertiesConfig > >> appenders = console > >>

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-10-08 Thread Austin Cawley-Edwards
; > > > > > On 05.10.20 09:25, Till Rohrmann wrote: > >> Hi Austin, > >> > >> thanks for offering to help. First I would suggest asking Timo whether > >> this is an aspect which is still missing or whether we overlooked it. > >> Based on tha

Re: Issue with Flink - unrelated executeSql causes other executeSqls to fail.

2020-10-08 Thread Austin Cawley-Edwards
y absolute path >>> reference. However, the actual log calls are not printing to the console. >>> Only errors appear in my terminal window and the test logs. Maybe console >>> logger does not work for this junit setup. I'll see if the file version >>>

Re: Streaming SQL Job Switches to FINISHED before all records processed

2020-10-09 Thread Austin Cawley-Edwards
e MAX_WATERMARK when the > pipeline shuts down even without having a timestamp assigned in the > StreamRecord. Watermark will leave SQL also without a time attribute as > far as I know. > > Regards, > Timo > > > On 08.10.20 17:38, Austin Cawley-Edwards wrote: > > He

Un-ignored Parsing Exceptions in the CsvFormat

2020-10-16 Thread Austin Cawley-Edwards
Hey all, I'm ingesting CSV files with Flink 1.10.2 using SQL and the CSV Format[1]. Even with the `ignoreParseErrors()` set, the job fails when it encounters some types of malformed rows. The root cause is indeed a `ParseException`, so I'm wondering if there's anything more I need to do to ignore
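
For context, the setup under discussion looks roughly like the following fragment, using the Flink 1.10 table descriptor classes (org.apache.flink.table.descriptors.*); the path, delimiter, and schema are illustrative, not taken from the original job:

    // Register a CSV-backed table; ignoreParseErrors() is meant to skip malformed rows,
    // but (as this thread describes) some ParseExceptions can still fail the job.
    tableEnv.connect(new FileSystem().path("file:///tmp/input.csv"))
        .withFormat(new Csv()
            .fieldDelimiter(',')
            .ignoreParseErrors())
        .withSchema(new Schema()
            .field("a", DataTypes.STRING())
            .field("b", DataTypes.STRING()))
        .createTemporaryTable("csv_source");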

Re: Flink Kubernetes / Helm

2020-10-16 Thread Austin Cawley-Edwards
We use the Ververica Platform and have built an operator for it here[1] :) and use Helm with it as well. Best, Austin [1]: https://github.com/fintechstudios/ververica-platform-k8s-operator On Fri, Oct 16, 2020 at 3:12 PM Dan Hill wrote: > What libraries do people use for running Flink on Kube

Re: Un-ignored Parsing Exceptions in the CsvFormat

2020-10-22 Thread Austin Cawley-Edwards
> > > On Fri, Oct 16, 2020 at 8:32 PM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hey all, >> >> I'm ingesting CSV files with Flink 1.10.2 using SQL and the CSV >> Format[1]. >> >> Even with the `ignoreParseE

Re: Print on screen DataStream content

2020-11-23 Thread Austin Cawley-Edwards
Hey Simone, I'd suggest trying out the `DataStream#print()` function to start, but there are a few other easy-to-integrate sinks for testing that you can check out in the docs here[1] Best, Austin [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/datastream_api.html#data-sink
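
A minimal fragment showing the suggestion; the elements here are made up:

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.fromElements("a", "b", "c")
        .map(String::toUpperCase)
        .print();                 // writes each record to the TaskManager stdout / test console
    env.execute("print-example");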

Re: StreamingFileSink Not Flushing All Data

2020-03-02 Thread Austin Cawley-Edwards
s flushed. > > > > I think you can also adjust that behavior with: > > > > forBulkFormat(...) > > > > .withRollingPolicy(/* your custom logic */) > > > > I also cc Kostas who should be able to correct me if I am wrong. > > > > Best, > > > >

Re: StreamingFileSink Not Flushing All Data

2020-03-03 Thread Austin Cawley-Edwards
ere's a FLIP >> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-46%3A+Graceful+Shutdown+Handling+by+UDFs> >> and >> an issue <https://issues.apache.org/jira/browse/FLINK-13103> about >> fixing this. I'm not sure what's the status though,

Re: StreamingFileSink Not Flushing All Data

2020-03-04 Thread Austin Cawley-Edwards
ed. > > Cheers, > Kostas > > [1] https://issues.apache.org/jira/browse/FLINK-13634 > > > On Tue, Mar 3, 2020 at 11:13 PM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hi all, >> >> Thanks for the docs pointer/ FLIP Rafi, and th

Re: Single stream, two sinks

2020-03-05 Thread Austin Cawley-Edwards
We have the same setup and it works quite well. One thing to take into account is that your HTTP call may happen multiple times if you’re using checkpointing/ fault tolerance mechanism, so it’s important that those calls are idempotent and won’t duplicate data. Also we’ve found that it’s important
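
The shape of that setup is roughly the following fragment; the Event type, the HTTP sink, and the Kafka producer are placeholders for whatever the real job uses:

    DataStream<Event> events = env.addSource(source);

    // The same stream fans out to both sinks; with checkpointing enabled the HTTP
    // call may be retried after a failure, so it has to be idempotent.
    events.addSink(new IdempotentHttpSink());   // placeholder custom SinkFunction
    events.addSink(kafkaProducer);              // e.g. a FlinkKafkaProducer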

Re: Does anyone have an example of Bazel working with Flink?

2020-06-11 Thread Austin Cawley-Edwards
Hey all, Adding to Aaron's response, we use Bazel to build our Flink apps. We've open-sourced some of our setup here[1] though a bit outdated. There are definitely rough edges/ probably needs a good deal of work to fit other setups. We have written a wrapper around the `java_library` and `java_bin

Re: Does anyone have an example of Bazel working with Flink?

2020-06-18 Thread Austin Cawley-Edwards
ave to add a MANIFEST.MF >> file to your jar under META-INF and this file needs to contain `Main-Class: >> org.a.b.Foobar`. >> >> Cheers, >> Till >> >> On Fri, Jun 12, 2020 at 12:30 AM Austin Cawley-Edwards < >> austin.caw...@gmail.com> wrote: &

Re: Does anyone have an example of Bazel working with Flink?

2020-06-18 Thread Austin Cawley-Edwards
a` and `rules_java` are enough for us at > this point. > > It's entirely possible I'm not thinking far enough into the future, > though, so don't take our lack of investment as a sign it's not worth > investing in :) > > Best, > > Aaron Levin > > On Thu

Decompressing Tar Files for Batch Processing

2020-07-06 Thread Austin Cawley-Edwards
Hey all, I need to ingest a tar file containing ~1GB of data in around 10 CSVs. The data is fairly connected and needs some cleaning, which I'd like to do with the Batch Table API + SQL (but have never used before). I've got a small prototype loading the uncompressed CSVs and applying the necessar
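
One way to handle the up-front tar extraction before the job runs (an option discussed later in this thread) is plain Apache Commons Compress; the file names and output directory are illustrative:

    // Extract the ~10 CSVs from the .tgz to a staging directory before the Flink job reads them.
    try (TarArchiveInputStream tar = new TarArchiveInputStream(
            new GzipCompressorInputStream(Files.newInputStream(Paths.get("data.tgz"))))) {
        TarArchiveEntry entry;
        while ((entry = tar.getNextTarEntry()) != null) {
            if (!entry.isDirectory()) {
                java.nio.file.Path out = Paths.get("/tmp/extracted", entry.getName());
                Files.createDirectories(out.getParent());
                // TarArchiveInputStream bounds reads to the current entry, so this copies one file.
                Files.copy(tar, out, StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }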

Re: Decompressing Tar Files for Batch Processing

2020-07-07 Thread Austin Cawley-Edwards
me supported filesystem. Decompressing the file is supported for > selected formats (deflate, gzip, bz2, xz), but this seems to be an > undocumented feature, so I'm not sure how usable it is in reality. > > On 07/07/2020 01:30, Austin Cawley-Edwards wrote: > > Hey all,

Re: Decompressing Tar Files for Batch Processing

2020-07-07 Thread Austin Cawley-Edwards
On Tue, Jul 7, 2020 at 10:53 AM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Hey Xiaolong, > > Thanks for the suggestions. Just to make sure I understand, are you saying > to run the download and decompression in the Job Manager before executing > the job? >

Re: Flink - Pod Identity

2021-04-03 Thread Austin Cawley-Edwards
Hi Swagat, I’ve used kube2iam[1] for granting AWS access to Flink pods in the past with good results. It’s all based on mapping pod annotations to AWS IAM roles. Is this something that might work for you? Best, Austin [1]: https://github.com/jtblin/kube2iam On Sat, Apr 3, 2021 at 10:40 AM Swaga

Re: Flink - Pod Identity

2021-04-03 Thread Austin Cawley-Edwards
s access to S3, shouldn't we have a > simpler mechanism to connect to underlying resources based on the service > account authorization? > > On Sat, Apr 3, 2021, 10:10 PM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hi Swagat, >> >> I’ve used

Re: Flink - Pod Identity

2021-04-03 Thread Austin Cawley-Edwards
://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/config.html#kubernetes-service-account On Sat, Apr 3, 2021 at 10:18 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Can you desc

Re: Flink - Pod Identity

2021-04-05 Thread Austin Cawley-Edwards
>> The webhook can make the STS call for the service account to role mapping. >> A side car container to which the main container has no access can even >> renew credentials becoz STS returns temp credentials. >> >> Sent from my iPhone >> >> On Apr 3, 2021, at 10:

Re: Flink - Pod Identity

2021-04-05 Thread Austin Cawley-Edwards
at are your thoughts on this? > > presto.s3.credentials-provider > > > On Tue, Apr 6, 2021 at 12:43 AM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> I've confirmed that for the bundled + shaded aws dependency, the only way >> to upgrade i

Re: Flink - Pod Identity

2021-04-05 Thread Austin Cawley-Edwards
e.org/jira/browse/FLINK-18676 On Mon, Apr 5, 2021 at 4:53 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > That looks interesting! I've also found the full list of S3 properties[1] > for the version of presto-hive bundled with Flink 1.12 (see [2]), which > include

Re: Flink - Pod Identity

2021-04-06 Thread Austin Cawley-Edwards
for your help and support. Especially Austin, he stands > out due to his interest in the issue and helping to find ways to resolve it. > > Regards, > Swagat > > On Tue, Apr 6, 2021 at 2:35 AM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> And ac

Re: Async + Broadcast?

2021-04-07 Thread Austin Cawley-Edwards
Hey Alex, I'm not sure if there is a best practice here, but what I can tell you is that I worked on a job that did exactly what you're suggesting with a non-async operator to create a [record, config] tuple, which was then passed to the async stage. Our config objects were also not tiny (~500kb)
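
A rough fragment of that pattern, with Record, Config, and the async enricher as stand-ins:

    MapStateDescriptor<String, Config> configDesc =
        new MapStateDescriptor<>("config", String.class, Config.class);

    DataStream<Tuple2<Record, Config>> withConfig = records
        .connect(configs.broadcast(configDesc))
        .process(new BroadcastProcessFunction<Record, Config, Tuple2<Record, Config>>() {
            @Override
            public void processElement(Record r, ReadOnlyContext ctx,
                                       Collector<Tuple2<Record, Config>> out) throws Exception {
                // Attach the latest broadcast config to every record.
                out.collect(Tuple2.of(r, ctx.getBroadcastState(configDesc).get("current")));
            }

            @Override
            public void processBroadcastElement(Config c, Context ctx,
                                                Collector<Tuple2<Record, Config>> out) throws Exception {
                ctx.getBroadcastState(configDesc).put("current", c);
            }
        });

    // The (non-broadcast) async stage then sees the [record, config] tuple on each element.
    DataStream<Enriched> enriched = AsyncDataStream.unorderedWait(
        withConfig, new MyAsyncEnrichFunction(), 30, TimeUnit.SECONDS);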

Re: CRD compatible with native and standalone mode

2021-04-20 Thread Austin Cawley-Edwards
Hi Gaurav, I think the name "Native Kubernetes" is a bit misleading – this just means that you can use the Flink CLI/ scripts to run Flink applications on Kubernetes without using the Kubernetes APIs/ kubectl directly. What features are you looking to use in the native mode? I think it would be d

Re: Flink support for Kafka versions

2021-04-20 Thread Austin Cawley-Edwards
Hi Prasanna, It looks like the Kafka 2.5.0 connector upgrade is tied to dropping support for Scala 2.11. The best place to track that would be the ticket for Scala 2.13 support, FLINK-13414 [1], and its subtask FLINK-20845 [2]. I have listed FLINK-20845 as a blocker for FLINK-19168 for better vis

Re: Alternatives to JDBCAppendTableSink in Flink 1.11

2021-04-20 Thread Austin Cawley-Edwards
Hey Sambaran, I'm not too familiar with the 1.7 JDBCAppendTableSink, but to make sure I understand what your current solution looks like, it's something like the following, where you're triggering a procedure on each element of a stream? JDBCAppendTableSink sink = JDBCAppendTableSink.buil

Re: Are configs stored as part of savepoints

2021-04-20 Thread Austin Cawley-Edwards
Hi Gaurav, Which configs are you referring to? Everything usually stored in `flink-conf.yaml`[1]? The State Processor API[2] is also a good resource to understand what is actually stored, and how you can access it outside of a running job. The SavepointMetadata class[3] is another place to referen

Re: Max-parellelism limitation

2021-04-20 Thread Austin Cawley-Edwards
Hi Olivier, Someone will correct me if I'm wrong, but I believe the max-parallelism limitation, where you cannot scale up past the previously defined max-parallelism, applies to all stateful jobs no matter which type of state you are using. If you haven't seen it already, I think the Production R

Re: Alternatives to JDBCAppendTableSink in Flink 1.11

2021-04-20 Thread Austin Cawley-Edwards
tStreamOperator . > Can you please let me know if this is the right approach and if yes, how do > we map the SingleOutputStreamOperator to the preparedstatement in > JdbcStatementBuilder? > > Thanks for your help! > > Regards > Sambaran > > On Tue, Apr 20, 2021 at 6:3
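
For reference, the Flink 1.11 JDBC connector's JdbcSink wires each element of the stream into a PreparedStatement through the JdbcStatementBuilder lambda; the SQL, field accessors, record type, and connection details below are illustrative only:

    // stream is assumed to be a DataStream<MyRecord>.
    stream.addSink(JdbcSink.sink(
        "CALL my_procedure(?, ?)",                       // placeholder statement
        (ps, element) -> {                               // JdbcStatementBuilder<MyRecord>
            ps.setString(1, element.getId());
            ps.setLong(2, element.getValue());
        },
        JdbcExecutionOptions.builder().withBatchSize(100).build(),
        new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
            .withUrl("jdbc:postgresql://localhost:5432/db")
            .withDriverName("org.postgresql.Driver")
            .build()));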

Re: Alternatives to JDBCAppendTableSink in Flink 1.11

2021-04-21 Thread Austin Cawley-Edwards
t; > On Tue, Apr 20, 2021 at 8:11 PM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hi Sambaran, >> >> I'm not sure if this is the best approach, though I don't know your full >> use case/ implementation. >> >> What

Re: Setup of Scala/Flink project using Bazel

2021-05-03 Thread Austin Cawley-Edwards
Hey Salva, This appears to be a bug in the `bazel-deps` tool, caused by mixing scala and Java dependencies. The tool seems to use the same target name for both, and thus produces duplicate targets (one for scala and one for java). If you look at the dict lines that are reported as conflicting, yo

Re: Setup of Scala/Flink project using Bazel

2021-05-04 Thread Austin Cawley-Edwards
Great! Feel free to post back if you run into anything else or come up with a nice template – I agree it would be a nice thing for the community to have. Best, Austin On Tue, May 4, 2021 at 12:37 AM Salva Alcántara wrote: > Hey Austin, > > There was no special reason for vendoring using `bazel-

Re: StopWithSavepoint() method doesn't work in Java based flink native k8s operator

2021-05-04 Thread Austin Cawley-Edwards
Hey all, Thanks for the ping, Matthias. I'm not super familiar with the details of @Yang Wang's operator, to be honest :(. Can you share some of your FlinkApplication specs? For the `kubectl logs` command, I believe that just reads stdout from the container. Which logging framework are you using

Re: Setup of Scala/Flink project using Bazel

2021-05-12 Thread Austin Cawley-Edwards
Hi Salva, I think you're almost there. Confusion is definitely not helped by the ADDONS/ PROVIDED_ADDONS thingy – I think I tried to get too fancy with that in the linked thread. I think the only thing you have to do differently is to adjust the target you are building/ deploying – instead of `//

Re: Setup of Scala/Flink project using Bazel

2021-05-12 Thread Austin Cawley-Edwards
Yikes, I see what you mean. I also can not get `neverlink` or adding the org.scala.lang artifacts to the deploy_env to remove them from the uber jar. I'm not super familiar with sbt/ scala, but do you know how exactly the assembly `includeScala` works? Is it just a flag that is passed to scalac?

Re: Setup of Scala/Flink project using Bazel

2021-05-12 Thread Austin Cawley-Edwards
I know @Aaron Levin is using `rules_scala` for building Flink apps, perhaps he can help us out here (and hope he doesn't mind the ping). On Wed, May 12, 2021 at 4:13 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Yikes, I see what you mean. I also can not get

Re: Apache Flink - A question about Tables API and SQL interfaces

2021-05-12 Thread Austin Cawley-Edwards
Hi Mans, I don't believe there are explicit triggers/evictors/timers in the Table API/ SQL, as that is abstracted away from the lower-level DataStream API. If you need to get into the fine-grained details, Flink 1.13 has made some good improvements in going from the Table API to the DataStream API

Re: Regarding Stateful Functions

2021-05-12 Thread Austin Cawley-Edwards
Hey Jessy, I'm not a Statefun expert but, hopefully, I can point you in the right direction for some of your questions. I'll also cc Gordan, who helps to maintain Statefun. *1. Is the stateful function a good candidate for a system(as above) that > should process incoming requests at the rate of

Re: Apache Flink - A question about Tables API and SQL interfaces

2021-05-13 Thread Austin Cawley-Edwards
Please let > me know if I missed anything. > > Thanks again. > > > On Wednesday, May 12, 2021, 04:36:06 PM EDT, Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > > > Hi Mans, > > I don't believe there are explicit triggers/evictors/timers in the

Re: RabbitMQ source does not stop unless message arrives in queue

2021-05-13 Thread Austin Cawley-Edwards
Hey Jose, Thanks for bringing this up – it indeed sounds like a bug. There is ongoing work to update the RMQ source to the new interface, which might address some of these issues (or should, if it is not already), tracked in FLINK-20628[1]. Would you be able to create a JIRA issue for this, or wou

Re: Helm chart for Flink

2021-05-17 Thread Austin Cawley-Edwards
Hi Pedro, There is currently no official Kubernetes Operator for Flink and, by extension, there is no official Helm chart. It would be relatively easy to create a chart for simply deploying standalone Flink resources via the Kubernetes manifests described here[1], though it would leave out the abi

Re: Helm chart for Flink

2021-05-18 Thread Austin Cawley-Edwards
be possible to use Helm pre-upgrade hook to take > savepoint and stop currently running job and then Helm will upgrade image > tags. The problem is that if you hit timeout while taking savepoint, it is > not clear how to recover from this situation > > Alexey > ---------

Re: RabbitMQ source does not stop unless message arrives in queue

2021-05-18 Thread Austin Cawley-Edwards
ing an exception in the Flink job is not the same as triggering a > stateful shutdown, but it might be hitting similar unack issues. > > John > > -- > *From:* Austin Cawley-Edwards > *Sent:* Thursday 13 May 2021 16:49 > *To:* Jose Vargas ; John Morr

Re: Flink PrometheusReporter support for HTTPS

2021-06-12 Thread Austin Cawley-Edwards
Hi Ashutosh, How are you deploying your Flink apps? Would running a reverse proxy like Nginx or Envoy that handles the HTTPS connection work for you? Best, Austin On Sat, Jun 12, 2021 at 1:11 PM Ashutosh Uttam wrote: > Hi All, > > Does PrometheusReporter provide support for HTTPS?. I couldn't

Re: Flink PrometheusReporter support for HTTPS

2021-06-16 Thread Austin Cawley-Edwards
>> have multiple JMs & TMs and cannot use static scrape targets >> >> Regards, >> Ashutosh >> >> On Sun, Jun 13, 2021 at 2:25 AM Austin Cawley-Edwards < >> austin.caw...@gmail.com> wrote: >> >>> Hi Ashutosh, >>> >>>

Re: Protobuf + Confluent Schema Registry support

2021-06-30 Thread Austin Cawley-Edwards
Hi Vishal, I don't believe there is another way to solve the problem currently besides rolling your own serializer. For the Avro + Schema Registry format, is this Table API format[1] what you're referring to? It doesn't look like there have been discussions around adding a similar format for Protobuf

Re: Jobmanagers are in a crash loop after upgrade from 1.12.2 to 1.13.1

2021-06-30 Thread Austin Cawley-Edwards
Hi Shilpa, Thanks for reaching out to the mailing list and providing those logs! The NullPointerException looks odd to me, but in order to better guess what's happening, can you tell me a little bit more about what your setup looks like? How are you deploying, i.e., standalone with your own manife

Re: Jobmanagers are in a crash loop after upgrade from 1.12.2 to 1.13.1

2021-07-01 Thread Austin Cawley-Edwards
Hi Shilpa, >> >> JobType was introduced in 1.13. So I guess the cause is that the client >> which creates and submit >> the job is still 1.12.2. The client generates a outdated job graph which >> does not have its JobType >> set and resulted in this NPE

Re: Recommended approach to debug this

2019-09-22 Thread Austin Cawley-Edwards
Have you reached out to the FlinkK8sOperator team on Slack? They’re usually pretty active on there. Here’s the link: https://join.slack.com/t/flinkk8soperator/shared_invite/enQtNzIxMjc5NDYxODkxLTEwMThmN2I0M2QwYjM3ZDljYTFhMGRiNDUzM2FjZGYzNTRjYWNmYTE1NzNlNWM2YWM5NzNiNGFhMTkxZjA4OGU Best, Austin

Flink 1.8.0 S3 Checkpoints fail with java.security.ProviderException: Could not initialize NSS

2019-10-10 Thread Austin Cawley-Edwards
Hi there, I'm getting the following error message on a Flink 1.8 cluster deployed on Kubernetes. I've already confirmed that the pod has access to S3 and write permissions to the bucket, but I can't understand what the SSL issue is and if it is related to S3 or not. I have tried both with the defa

Flink x GATE Integration

2019-12-17 Thread Austin Cawley-Edwards
ely new to GATE, but trying to explore the options out there for good text processing within Flink. Best, Austin Cawley-Edwards [1]: https://gate.ac.uk [2]: https://groups.io/g/gate-users/topic/usage_with_data_processing/68761558?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,68761558

Re: Custom label for Prometheus Exporter

2020-01-22 Thread Austin Cawley-Edwards
Hey Anaray, Have you checked out the “scope” configuration?[1] Best, Austin [1]: https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#scope On Wed, Jan 22, 2020 at 4:09 PM anaray wrote: > Hi flink team, > > Is there a way to add a custom label to flink metrics when

Re: Custom label for Prometheus Exporter

2020-01-22 Thread Austin Cawley-Edwards
Cawley-Edwards < austin.caw...@gmail.com> wrote: > Hey Anaray, > > Have you checked out the “scope” configuration?[1] > > Best, > Austin > > > [1]: > > https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#scope > > On Wed, J

CSV StreamingFileSink

2020-02-18 Thread Austin Cawley-Edwards
Hey all, Has anyone had success using the StreamingFileSink[1] to write CSV files? And if so, what about compressed (Gzipped, ideally) files/ which libraries did you use? Best, Austin [1]: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html
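
For plain (uncompressed) CSV, one approach is to render each record to a line and use the row format; the path and field accessors are made up, and gzip output would instead need forBulkFormat with a compressing BulkWriter:

    DataStream<String> csvLines = records
        .map(r -> String.join(",", r.getA(), r.getB()));   // naive CSV rendering, no quoting/escaping

    StreamingFileSink<String> sink = StreamingFileSink
        .forRowFormat(new Path("file:///tmp/csv-out"), new SimpleStringEncoder<String>("UTF-8"))
        .build();

    csvLines.addSink(sink);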

Re: CSV StreamingFileSink

2020-02-18 Thread Austin Cawley-Edwards
2020-02-19/ large-file-3.txt Or is there another way to accomplish this? Thanks! Austin On Tue, Feb 18, 2020 at 5:33 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Hey all, > > Has anyone had success using the StreamingFileSink[1] to write CSV files? >

Re: CSV StreamingFileSink

2020-02-19 Thread Austin Cawley-Edwards
he output data. > > This should help for your use case: > > > https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment > > Regards, > Timo > > > On 19.02.20 01:00, Austin Cawley-Edwards wrote: > > Following up

StreamingFileSink Not Flushing All Data

2020-02-21 Thread Austin Cawley-Edwards
Hi there, Using Flink 1.9.1, trying to write .tgz files with the StreamingFileSink#BulkWriter. It seems like flushing the output stream doesn't flush all the data written. I've verified I can create valid files using the same APIs and data on their own, so thinking it must be something I'm doing w

Setting Flink Monitoring API Port on YARN Cluster

2018-09-06 Thread Austin Cawley-Edwards
Hi everyone, I'm running a YARN session on a cluster with one master and one core and would like to use the Monitoring API programmatically to submit jobs. I have found that the configuration variables are read but ignored when starting the session - it seems to choose a random port each run. Her

Re: Setting Flink Monitoring API Port on YARN Cluster

2018-09-07 Thread Austin Cawley-Edwards
> > [4] > https://hadoop.apache.org/docs/r2.8.3/api/org/apache/hadoop/yarn/api/records/ApplicationReport.html#getRpcPort() > > On Fri, Sep 7, 2018 at 12:33 AM, Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hi everyone, >> >> I&#x

Triggering Savepoints with the Monitoring API

2018-09-10 Thread Austin Cawley-Edwards
Hi there, I would just like a quick sanity-check: it is possible to start a job with a savepoint from another job, and have the new job save to a new checkpoint directory without overwriting the original checkpoints, correct? Thank you so much! Austin

Re: Triggering Savepoints with the Monitoring API

2018-09-11 Thread Austin Cawley-Edwards
that these two jobs will be subject > to certain restrictions. For more information, please refer to Official > documentation.[1] > > Thanks, vino. > > [1]: > https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/upgrading.html#application-state-compatibility > >

Accumulating a batch

2018-10-25 Thread Austin Cawley-Edwards
Hi there, Is it possible to use an AggregationFunction to accumulate n values in a buffer until a threshold is met, then stream down all records in the batch? Thank you! Austin Cawley-Edwards

Re: Accumulating a batch

2018-10-26 Thread Austin Cawley-Edwards
> [2] > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/udfs.html#aggregation-functions > > On Fri, Oct 26, 2018 at 5:08 AM Austin Cawley-Edwards < > austin.caw...@gmail.com> wrote: > >> Hi there, >> >> Is it possible to use an Aggregatio

Flink CEP Watermark Exception

2018-10-30 Thread Austin Cawley-Edwards
Hi there, We have a streaming application that uses CEP processing but are getting this error fairly frequently after a checkpoint fails, though not sure if it is related. We have implemented both `hashCode` and `equals()` using `Objects.hash(...properties)` and basic equality, respectively. Has

Re: Flink CEP Watermark Exception

2018-10-30 Thread Austin Cawley-Edwards
Following up, we are using Flink 1.5.0 and Flink-CEP 2.11. Thanks, Austin On Tue, Oct 30, 2018 at 3:58 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wrote: > Hi there, > > > We have a streaming application that uses CEP processing but are getting this > error fair

Re: Flink CEP Watermark Exception

2018-11-01 Thread Austin Cawley-Edwards
t do you mean by "after a checkpoint > fails", what is the reason why checkpoint fails? Would it be possible for > you to prepare some reproducible example for that problem? Finally, I would > also recommend trying out Flink 1.6.x, as we reworked the underlying > structure

Re: Flink CEP Watermark Exception

2018-11-06 Thread Austin Cawley-Edwards
wo issues regarding CEP you've linked concern very old Flink > version (1.0.3), CEP library was heavily reworked since then and I would > not look for any similarities in those cases. > > Best, > > Dawid > On 01/11/2018 14:24, Austin Cawley-Edwards wrote: > > Hi

Re: Running two versions of Flink with testcontainers

2021-07-13 Thread Austin Cawley-Edwards
You might also be able to put them in separate networks[1] to get around changing all the ports and still ensuring that they don't see each other. [1]: https://www.testcontainers.org/features/networking#advanced-networking On Tue, Jul 13, 2021 at 9:07 AM Chesnay Schepler wrote: > It is possible

Re: Running two versions of Flink with testcontainers

2021-07-13 Thread Austin Cawley-Edwards
Great, glad it worked out for you! On Tue, Jul 13, 2021 at 10:32 AM Farouk wrote: > Thanks > > Finally I tried by running docker commands (thanks for the documentation) > and it works fine. > > Thanks > Farouk > > Le mar. 13 juil. 2021 à 15:48, Austin Cawley-Edwards

Re: Recover from savepoints with Kubernetes HA

2021-07-21 Thread Austin Cawley-Edwards
Hi Thomas, I've got a few questions that will hopefully help us find an answer: What job properties are you trying to change? Something like parallelism? What mode is your job running in? i.e., Session, Per-Job, or Application? Can you also describe how you're redeploying the job? Are you u

Re: Recover from savepoints with Kubernetes HA

2021-07-22 Thread Austin Cawley-Edwards
down, update the start > parameter with the recent savepoint and renamed ‚kubernetes.cluster-id‘ as > well as ‚high-availability.storageDir‘. > > When I trigger a savepoint with cancel, I also see that the HA config maps > are cleaned up. > > > Kr Thomas > > Austi

Re: Recover from savepoints with Kubernetes HA

2021-07-23 Thread Austin Cawley-Edwards
ckpoint pointers) stored in the ConfigMap/ZNode will be deleted. >> >> But it is strange that the "-s/--fromSavepoint" does not take effect when >> redeploying the Flink application. The JobManager logs could help a lot to >> find the root cause. >> >>

Re: Can we release new flink-connector-redis? Thanks!

2021-08-17 Thread Austin Cawley-Edwards
Hi Hongbo, Thanks for your interest in the Redis connector! I'm not entirely sure what the release process is like for Bahir, but I've pulled in @Robert Metzger who has been involved in the project in the past and can give an update there. Best, Austin On Tue, Aug 17, 2021 at 10:41 AM Hongbo Mi

Re: HTTP or REST SQL Client

2021-10-02 Thread Austin Cawley-Edwards
Hi Declan, I think the Flink-sql-gateway[1] is what you’re after, though I’m not sure of its current state. I’m cc’ing Ingo, who may be able to help direct us. Best, Austin [1]: https://github.com/ververica/flink-sql-gateway On Sat, Oct 2, 2021 at 10:56 AM Declan Harrison wrote: > Hi All > >

Re: Flink + K8s

2021-11-02 Thread Austin Cawley-Edwards
Hi Rommel, That’s correct that K8s will restart the JM pod (assuming it’s been created by a K8s Job or Deployment), and it will pick up the HA data and resume work. The only use case for having multiple replicas is faster failover, so you don’t have to wait for K8s to provision that new pod (which

Re: IterativeCondition instead of SimpleCondition not matching pattern

2021-11-04 Thread Austin Cawley-Edwards
Hi Isidoros, Thanks for reaching out to the mailing list. I haven't worked with the CEP library in a long time but can try to help. I'm having a little trouble understanding the desired output + rules. Can you mock up the desired output like you have for the fulfilled pattern sequence? Best, Aust

Re: GenericWriteAheadSink, declined checkpoint for a finished source

2021-11-04 Thread Austin Cawley-Edwards
Hi James, You are correct that since Flink 1.14 [1] (which included FLIP-147 [2]) there is support for checkpointing after some tasks has finished, which sounds like it will solve this use case. You may also want to look at the JDBC sink[3] which also supports batching, as well as some other nice

Re: IterativeCondition instead of SimpleCondition not matching pattern

2021-11-04 Thread Austin Cawley-Edwards
st, Austin On Thu, Nov 4, 2021 at 4:09 PM Isidoros Ioannou wrote: > > > -- Forwarded message - > From: Isidoros Ioannou > Date: Thu, Nov 4, 2021 at 10:01 PM > Subject: Re: IterativeCondition instead of SimpleCondition not matching > pattern > To: Aust

Re: Running a performance benchmark load test of Flink on new CPUs

2021-11-05 Thread Austin Cawley-Edwards
Hi Vijay, I'm not too familiar with the subject, but maybe you could have a look at the flink-faker[1], which generates fake data. I would think you could use it to write to kafka in one Flink job, and then have another Flink job to ingest and run your benchmarks. There is also this microbenchmar

Re: stream consume from kafka after DB scan is done

2021-11-05 Thread Austin Cawley-Edwards
Hey Qihua, If I understand correctly, you should be able to with the HybridSource, released in 1.14 [1] Best, Austin [1]: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/connectors/datastream/hybridsource/ On Fri, Nov 5, 2021 at 3:53 PM Qihua Yang wrote: > Hi, > > Our stream h
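
A rough fragment of the HybridSource wiring in Flink 1.14, assuming a bounded source for the initial DB/file snapshot followed by a KafkaSource; the source variables and names are illustrative:

    HybridSource<String> dbThenKafka =
        HybridSource.<String>builder(boundedSnapshotSource)  // bounded source, runs to completion first
            .addSource(kafkaSource)                          // then switches to the Kafka source
            .build();

    DataStream<String> stream = env.fromSource(
        dbThenKafka, WatermarkStrategy.noWatermarks(), "db-then-kafka");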
