Re: [EXTERNAL] Re: Secure Azure Credential Configuration

2023-03-06 Thread Alexis Sarda-Espinosa
Hello, I actually needed this myself so I have validated it. Again, this is if you want Flink itself to access Azure, and I'm fairly certain you have to use Java because the plugin's class loader won't have access to the Scala library's jars. * You have to build against https://mvnrepository.com/

Re: [EXTERNAL] Re: Secure Azure Credential Configuration

2023-03-06 Thread Swathi C
Hi Ivan, You can try setting this up with MSI so that the Flink pods can access the storage account; you might need to add the podIdentity to the Flink pod so that it can access it. (The MSI should have access to the storage account as well.) The pod identity will have the required permissions to acces

RE: [EXTERNAL] Re: Secure Azure Credential Configuration

2023-03-06 Thread Ivan Webber via user
Thanks for the pointers Alexis! Implementing `org.apache.hadoop.fs.azure.KeyProvider` has helped me make progress, but I’m running into a new error: ``` org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.azure.KeyProviderException: org.tryflink.wrappers.TrafficForecastEnvKeyProviderWrapper

Re: Avoiding data shuffling when reading pre-partitioned data from Kafka

2023-03-06 Thread Tommy May
Hi Ken & David, Thanks for following up. I've responded to your questions below. > If the number of unique keys isn’t huge, I could think of yet another helicopter stunt that you could try :) Unfortunately the number of keys in our case is huge; they're unique per handful of events. If your d

Re: Create generic DeserializationSchema (Scala)

2023-03-06 Thread Alexey Novakov via user
Hi Ana, I think you will need to deal with ClassTag to keep all the code generic. I've found such example which should help: https://github.com/amzn/milan/blob/7dfa29b434ced7eef286ea34c5085c10c1b787b6/milan/milan-compilers/milan-flink-compiler/src/main/scala/com/amazon/milan/compiler/flink/serial

Create generic DeserializationSchema (Scala)

2023-03-06 Thread Ana Gómez González
Hello! This is my first time sending a question to this mailing list, so I hope I'm not messing anything up. I'm not fully sure whether what I want to do is conceptually correct, so please let me know. I want to create a generic class that extends a DeserializationSchema. I want an easy way of creating different deser

Failing to process timestamp data from Kafka + Debezium Avro using Flink SQL

2023-03-06 Thread Frank Lyaruu
Hi all, I'm trying to ingest change capture data from Kafka which contains some timestamps. I'm using Flink SQL, and I'm running into issues, specifically with the created_at field. In Postgres, it is of type 'timestamptz'. My table definition is this: CREATE TABLE contacts ( contact_id STR

Re: Avoiding data shuffling when reading pre-partitioned data from Kafka

2023-03-06 Thread David Morávek
Using operator state for a stateful join isn't great because it's meant to hold only minimal state related to the operator (e.g., partition tracking). If your data are already pre-partitioned and the partitioning matches (hash partitioning on the Java representation of the key yielded by the

Re: obtain the broadcast stream information in sink

2023-03-06 Thread Weihua Hu
Hi, Could you describe your usage scenario in detail? Why do you need to get the broadcast stream in the sink? And could you split an operator out of the sink to deal with the broadcast stream? Best, Weihua On Mon, Mar 6, 2023 at 10:57 AM zhan...@eastcom-sw.com < zhan...@eastcom-sw.com> wrote: > > h