ElasticSearch 8 Sink Kryo Deserialization Error

2025-05-21 Thread Andrey
e for your support. Kind regards, Andrey Starostin

Re: Table API/SQL CDC Join produces deletes

2025-05-19 Thread Andrey
tor. Hence, we need both records to keep the state up to date correctly. I have also followed a suggestion found online to rerank items after a join, to get the unique key back and optimize downstream joins. This worked well. Thank you for your support. Best regards, Andrey Starostin On Wed, Apr 1

Table API/SQL CDC Join produces deletes

2025-03-29 Thread Andrey
owever, I wonder if there is a way to somehow preserve the `+U` through the Join, as that would reduce the amount of events going through the system? Thank you in advance for taking the time to read and answer this question, I really appreciate the help. Kind regards, Andrey Starostin

Re: No effect from --allowNonRestoredState or "execution.savepoint.ignore-unclaimed-state" in K8S application mode

2022-02-22 Thread Andrey Bulgakov
lementation[1]. > > [1]. > https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/cli/ProgramOptions.java#L181 > > > Best, > Yang > > Andrey Bulgakov 于2022年2月19日周六 08:30写道: > >> Hi Austin, >>

Re: No effect from --allowNonRestoredState or "execution.savepoint.ignore-unclaimed-state" in K8S application mode

2022-02-18 Thread Andrey Bulgakov
ntials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \ -Dio.tmp.dirs=/data/flink-local-data \ -Dqueryable-state.enable=true \ The only one i'm having problems with is "execution.savepoint.ignore-unclaimed-state". On Fri, Feb 18, 2022 at 3:42 PM Austin Cawley-Edwards < austin.caw.

No effect from --allowNonRestoredState or "execution.savepoint.ignore-unclaimed-state" in K8S application mode

2022-02-18 Thread Andrey Bulgakov
r am I doing something wrong? For context, the savepoint is produced by Flink 1.8.2 and the version I'm trying to run on K8S is 1.14.3. -- With regards, Andrey Bulgakov

Re: Extracting state keys for a very large RocksDB savepoint

2021-03-17 Thread Andrey Bulgakov
I guess there's no point in making it a KeyedProcessFunction since it's not going to have access to context, timers or anything like that. So it can be a simple InputFormat returning a DataSet of key and value tuples. On Wed, Mar 17, 2021 at 8:37 AM Andrey Bulgakov wrote: > Hi

Re: Extracting state keys for a very large RocksDB savepoint

2021-03-17 Thread Andrey Bulgakov
ut what works. Please let me know if you have thoughts about this. On Sun, Mar 14, 2021 at 11:55 PM Tzu-Li (Gordon) Tai wrote: > Hi Andrey, > > Perhaps the functionality you described is worth adding to the State > Processor API. > Your observation on how the library currently works

Re: Extracting state keys for a very large RocksDB savepoint

2021-03-14 Thread Andrey Bulgakov
KeyGroupsStateHandle objects and built an InputFormat heavily based on the code I found in RocksDBFullRestoreOperation.java. It ended up working extremely quickly while keeping memory and CPU usage at the minimum. On Tue, Mar 9, 2021 at 1:51 PM Andrey Bulgakov wrote: > Hi all, > > I'm

Extracting state keys for a very large RocksDB savepoint

2021-03-09 Thread Andrey Bulgakov
es available to each task manager - enabling object reuse and modifying the tuple mapper to avoid extra tuple allocations - manipulating memory ratios to allocate more memory to be used as heap, managed - allocating 20% of memory for JVM overhead - switching to G1GC garbage collector Again, would appreciate any help with this. -- With regards, Andrey Bulgakov

Re: Flink 1.9Version State TTL parameter configuration it does not work

2020-12-04 Thread Andrey Zagrebin
Flink, before any state is created and it starts to load. Which version of Flink do you use? Did you try to enable the filter without starting from the checkpoint, basically from the beginning of the job run? Best, Andrey On Fri, Dec 4, 2020 at 11:27 AM Yang Peng wrote: > Hi,I have some questi

Re: Caching Mechanism in Flink

2020-11-19 Thread Andrey Zagrebin
ode but not intended to be used by Flink. Best, Andrey On Wed, Nov 11, 2020 at 3:06 PM Jack Kolokasis wrote: > Hi Matthias, > > Yeap, I am refer to the tasks' off-heap configuration value. > > Best, > Iacovos > On 11/11/20 1:37 μ.μ., Matthias Pohl wrote: > > When t

Re: Logs of JobExecutionListener

2020-11-19 Thread Andrey Zagrebin
t; > Which version are you using? > > I used the exact same commands on Flink 1.11.0 and I didn't get the job > > listener output.. > > > > Il gio 19 nov 2020, 12:53 Andrey Zagrebin ha > scritto: > > > >> Hi Flavio and Aljoscha, > >>

Re: Logs of JobExecutionListener

2020-11-19 Thread Andrey Zagrebin
execution finished Job with JobID c454a894d0524ccb69943b95838eea07 has finished. Job Runtime: 139 ms EXECUTED Best, Andrey On Thu, Nov 19, 2020 at 2:40 PM Aljoscha Krettek wrote: > JobListener.onJobExecuted() is only invoked in > ExecutionEnvironme

Re: Logs of JobExecutionListener

2020-11-19 Thread Andrey Zagrebin
Hi Flavio, I think I can reproduce what you are reporting (assuming you also pass '--output' to 'flink run'). I am not sure why it behaves like this. I would suggest filing a Jira ticket for this. Best, Andrey On Wed, Nov 18, 2020 at 9:45 AM Flavio Pompermaier wrote: >

Re: Strange behaviour when using RMQSource in Flink 1.11.2

2020-11-18 Thread Andrey Zagrebin
t for this in Flink, you can write your suggestion there as well and once there is positive feedback from a committer, a github PR can be opened. Best, Andrey [1] https://issues.apache.org/jira/projects/FLINK On Wed, Nov 18, 2020 at 3:49 PM Thomas Eckestad < thomas.eckes...@verisure.com> w

Re: Support of composite data types in flink-parquet

2020-10-20 Thread Andrey Zagrebin
Hi Jon, I have found this ticket [1]. It looks like what you are looking for. Best, Andrey [1] https://issues.apache.org/jira/browse/FLINK-17782 On Tue, Oct 20, 2020 at 4:50 PM Jon Alberdi wrote: > Hello, as stated at > https://ci.apache.org/projects/flink/flink-docs-stable/dev

Re: rename error in flink sql

2020-10-20 Thread Andrey Zagrebin
Hi, I am not an SQL expert but I would not expect the original POJO to match the new row with the renamed field. Maybe Timo or Dawid have to add something. Best, Andrey On Tue, Oct 20, 2020 at 4:56 PM 大森林 wrote: > > I'm learning "select"from > official document

Re: what's the new version of createTemporaryView?

2020-10-20 Thread Andrey Zagrebin
"user,product,amount")); Best, Andrey On Tue, Oct 20, 2020 at 1:38 PM 大森林 wrote: > > > my code is: > > tEnv.createTemporaryView("orderA", orderA, "user,product,amount"); > > > I got the hint: > > > createTemporaryView is deprecated >

Re: flink session job retention time

2020-10-09 Thread Andrey Zagrebin
Hi Richard, If you mean the retention of completed jobs, there are following options: jobstore.cache-size [1] jobstore.expiration-time [2] jobstore.max-capacity [3] Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#jobstore-cache-size [2] https

Re: Flink DynamoDB stream connector losing records

2020-09-10 Thread Andrey Zagrebin
AWS support > but got no response except for increasing WCU and RCU. > > Is it possible that Flink will lose exactly-once semantics when throttled? > > On Thu, Sep 10, 2020 at 10:31 PM Jiawei Wu > wrote: > >> Hi Andrey, >> >> Thanks for your suggestion, but I'

Re: Flink DynamoDB stream connector losing records

2020-09-10 Thread Andrey Zagrebin
Hi Jiawei, Could you try Flink latest release 1.11? 1.8 will probably not get bugfix releases. I will cc Ying Xu who might have a better idea about the DinamoDB source. Best, Andrey On Thu, Sep 10, 2020 at 3:10 PM Jiawei Wu wrote: > Hi, > > I'm using AWS kinesis analytics ap

Re: Flink OnCheckpointRollingPolicy streamingfilesink

2020-08-29 Thread Andrey Zagrebin
y a handful of records in between if you do not observe any practical decrease of latency. The system will just waste resources to process the checkpoints. Best, Andrey On Fri, Aug 28, 2020 at 9:52 PM Vijayendra Yadav wrote: > Hi Andrey, > > Thanks, > what is rec

Re: OOM error for heap state backend.

2020-08-26 Thread Andrey Zagrebin
manager runs one JVM system process on the machine where it is deployed to. Best, Andrey On Wed, Aug 26, 2020 at 6:52 PM Vishwas Siravara wrote: > Hi Andrey, > Thanks for getting back to me so quickly. The screenshots are for 1GB > heap, the keys for the state are 20 character str

Re: Example flink run with security options? Running on k8s in my case

2020-08-26 Thread Andrey Zagrebin
Hi Adam, maybe also check your SSL setup in a local cluster to exclude possibly related k8s things. Best, Andrey On Wed, Aug 26, 2020 at 3:59 PM Adam Roberts wrote: > Hey Nico - thanks for the prompt response, good catch - I've just tried > with the two security options (enabli

Re: OOM error for heap state backend.

2020-08-26 Thread Andrey Zagrebin
Xintong suggested. Best, Andrey On Wed, Aug 26, 2020 at 4:21 PM Vishwas Siravara wrote: > Hi Andrey and Xintong. 2.5 GB is from the flink web UI( checkpoint size). > I took a heap dump and I could not find any memory leak from user code. I > see the similar behaviour on smaller heap size,

Re: How jobmanager and task manager communicates with each other ?

2020-08-25 Thread Andrey Zagrebin
new configuration to connect to another job manager. (2) Which and how flink's HA service can be used for the service discovery > of job manager ? You can check the docs for the zookeeper implementation of the HA in Flink [1] Best, Andrey [1] https://ci.apache.org/projects/flin

Re: Transition Flink job from Java to Scala with state migration

2020-08-25 Thread Andrey Zagrebin
] When you restore the state after changing the job, the new serializer, used for the state type, should be compatible with the serializer, used to store the state in the previous version of the job [2]. I also cc Gordon who could have more ideas about the problem. Best, Andrey [1] https

Re: Flink OnCheckpointRollingPolicy streamingfilesink

2020-08-25 Thread Andrey Zagrebin
checkpoint occurs to provide exactly once guarantee. Best, Andrey On Mon, Aug 24, 2020 at 6:18 PM Vijayendra Yadav wrote: > Hi Team, > > Bulk Formats can only have `OnCheckpointRollingPolicy`, which rolls (ONLY) > on every checkpoint. > > *.withRollingPolicy(OnCheckpointR

Re: OOM error for heap state backend.

2020-08-25 Thread Andrey Zagrebin
-heap size of state java objects. I never heard about such a Flink metric. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/mem_setup.html On Mon, Aug 24, 2020 at 4:05 AM Xintong Song wrote: > Hi Vishwas, > > According to the log, heap space is 13+GB, wh

Re: flink1.10中hive module 没有plus,greaterThan等函数

2020-08-25 Thread Andrey Zagrebin
Hi Faaron, This mailing list is for support in English. Could you translate your question into English? You can also subscribe to the user mailing list in Chinese to get support in Chinese [1] Best, Andrey [1] user-zh-subscr...@flink.apache.org On Fri, Aug 21, 2020 at 4:43 AM faaron zheng

Re: Monitor the usage of keyed state

2020-08-25 Thread Andrey Zagrebin
Hi Mu, I would suggest to look into RocksDB metrics which you can enable as Flink metrics [1] Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/config.html#rocksdb-native-metrics On Fri, Aug 21, 2020 at 4:27 AM Mu Kong wrote: > Hi community, > > I hav

Re: Is there a way to start a timer without ever receiving an event?

2020-08-12 Thread Andrey Zagrebin
I do not think so. Each timer in KeyedProcessFunction is associated with the key. The key is implicitly set into the context from the record which is currently being processed. On Wed, Aug 12, 2020 at 8:00 AM Marco Villalobos wrote: > In the Stream API KeyedProcessFunction,is there a way to star

Re: JM & TM readiness probe

2020-08-12 Thread Andrey Zagrebin
state of TaskManager, it is all only over internal communication (currently akka). Maybe Chesnay has a better idea. We also use TCP port 6122 as TaskManager livenessProbe [2]. This might not qualify as a readiness probe though. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs

Re: Purpose of starting LeaderRetrievalService in DefaultDispatcherResourceManagerComponentFactory#create

2020-07-02 Thread Andrey Zagrebin
in a way that they can be separate remote JVM processes, potentially running on different machines, it is just not the case now. If they are separate remote processes then each of them needs its own leader service to discover each other. This is why it is implemented like you see it. Best, Andrey

Re: Performance issue associated with managed RocksDB memory

2020-06-26 Thread Andrey Zagrebin
to be true and false. Anyways I do not know how we could control splitting of the configured managed memory among operators in a more optimal way. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#managed-memory <https://ci.apache.or

Re: Performance issue associated with managed RocksDB memory

2020-06-25 Thread Andrey Zagrebin
Hi Juha, Thanks for sharing the testing program to expose the problem. This indeed looks suboptimal if X does not leave space for the window operator. I am adding Yu and Yun who might have a better idea about what could be improved about sharing the RocksDB memory among operators. Best, Andrey

Re: Is State TTL possible with event-time characteristics ?

2020-06-17 Thread Andrey Zagrebin
round cleanup. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/state/state.html#cleanup-in-background On Wed, Jun 17, 2020 at 3:07 PM Congxian Qiu wrote: > Hi > Currently, Flink does not support event-time TTL state, there is an > issue[

Re: Internal state and external stores conditional usage advice sought (dynamodb asyncIO)

2020-06-09 Thread Andrey Zagrebin
Hi Orionemail, There is no simple state access in asyncIO operator. I think this would require a custom caching solution. Maybe, other community users solved this problem in some other way. Best, Andrey On Mon, Jun 8, 2020 at 5:33 PM orionemail wrote: > Hi, > > Following on from a

Re: Flink on yarn : yarn-session understanding

2020-06-09 Thread Andrey Zagrebin
Hi Anuj, Afaik, the REST API should work for both modes. What is the issue? Maybe, some network problem to connect to YARN application master? Best, Andrey On Mon, Jun 8, 2020 at 4:39 PM aj wrote: > I am running some stream jobs that are long-running always. I am currently > submittin

Re: Run command after Batch is finished

2020-06-08 Thread Andrey Zagrebin
, maybe, he has more ideas. Best, Andrey On Sun, Jun 7, 2020 at 1:35 PM Mark Davis wrote: > Hi Jeff, > > Unfortunately this is not good enough for me. > My clients are very volatile, they start a batch and can go away any > moment without waiting for it to finish. Think of

Re: Data Quality Library in Flink

2020-06-08 Thread Andrey Zagrebin
checks and failures as separate operators and side outputs (for streaming) [1], if not yet Then you could also use Flink metrics to aggregate and monitor the data [2]. The metrics systems usually allow to define alerts on metrics, like in prometheus [3], [4]. Best, Andrey [1] https://ci.apac

Re: The trigger of State TTL

2020-06-02 Thread Andrey Zagrebin
expired state [2]. Please, see the notes in the user documentation chapters referred by the suggested links. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/state/state.html#incremental-cleanup [2] https://ci.apache.org/projects/flink/flink-docs-release-1.10

Re: Memory issue in Flink 1.10

2020-05-27 Thread Andrey Zagrebin
network buffers also use the JVM direct memory but Flink makes sure that they do not exceed their limit [5]. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_setup.html#managed-memory [2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory

Re: Flink 1.8.3 Kubernetes POD OOM

2020-05-24 Thread Andrey Zagrebin
till OOM and other types hopefully stabilise on some level. Then you could take a dump of that ever growing type of memory consumption to analyse if there is memory leak. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/mem_setup.html#total-memory <ht

Re: Flink 1.8.3 Kubernetes POD OOM

2020-05-22 Thread Andrey Zagrebin
Hi Josson, Do you use state backend? is it RocksDB? Best, Andrey On Fri, May 22, 2020 at 12:58 PM Fabian Hueske wrote: > Hi Josson, > > I don't have much experience setting memory bounds in Kubernetes myself, > but my colleague Andrey (in CC) reworked Flink's memory c

Re: Issue with single job yarn flink cluster HA

2020-04-02 Thread Andrey Zagrebin
Hi Dinesh, Thanks for sharing the logs. There were couple of HA fixes since 1.7, e.g. [1] and [2]. I would suggest to try Flink 1.10. If the problem persists, could you also find the logs of the failed Job Manager before the failover? Best, Andrey [1] https://jira.apache.org/jira/browse/FLINK

Re: Issue with single job yarn flink cluster HA

2020-03-24 Thread Andrey Zagrebin
. no leader is elected or a job is not restarted after the current leader failure? Best, Andrey On Sun, Mar 22, 2020 at 11:14 AM Dinesh J wrote: > Attaching the job manager log for reference. > > 2020-03-22 11:39:02,693 WARN > org.apache.flink.runtime.webmonitor.retriever.impl.RpcGate

Re: [DISCUSS] FLIP-111: Docker image unification

2020-03-22 Thread Andrey Zagrebin
ith the *tee* command looks promising. I would prefer to write stdout/err into separate files and preserve them as stdout/err for container logs. This needs more experiments but may be possible with the *tee* command. I suggest to check the details in PRs. # Java/Python/Dev versiona Shipping officia

Re: Flink long state TTL Concerns

2020-03-20 Thread Andrey Zagrebin
ke sure they are serializable for Flink. Pay attention to migration if you want to evolve the types of your objects, it is not a trivial topic [1]. Best, Andrey On Fri, Mar 20, 2020 at 9:39 AM Matthew Rafael Magsombol < raffy4...@gmail.com> wrote: > And another additional followup! >

Re: Flink long state TTL Concerns

2020-03-19 Thread Andrey Zagrebin
CPU cycles for the background cleanup. This affects storage size and potentially processing latency per record. You can read about details and caveats in the docs: for heap state [3] and RocksDB [4]. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/state

Re: Upgrade flink fail from 1.9.1 to 1.10.0

2020-03-18 Thread Andrey Zagrebin
failure. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/state/savepoints.html On Wed, Mar 18, 2020 at 4:27 PM Reo Lei wrote: > Hi all, > > I encountered a problem when I upgrade flink from 1.9.1 to 1.10.0. > > At first, my job is running on flink

Re: Question about RocksDBStateBackend Compaction Filter state cleanup

2020-03-17 Thread Andrey Zagrebin
/aggregating/folding/reducing state: key -> value - map state: key -> map key -> value - list state: key -> list -> element in some position Best, Andrey On Tue, Mar 17, 2020 at 11:04 AM Yun Tang wrote: > Hi Lake > > Flink leverage RocksDB's background compaction mec

Re: [DISCUSS] FLIP-111: Docker image unification

2020-03-16 Thread Andrey Zagrebin
only docker specific. All in all, I would say, once we implement them, we can revisit this topic. Best, Andrey On Wed, Mar 11, 2020 at 8:58 AM Yangze Guo wrote: > Thanks for the reply, Andrey. > > Regarding building from local dist: > - Yes, I bring this up mostly for development p

Re: Restoring state from an incremental RocksDB checkpoint

2020-03-13 Thread Andrey Zagrebin
//ci.apache.org/projects/flink/flink-docs-release-1.10/ops/state/checkpoints.html#directory-structure> > On 14 Mar 2020, at 00:12, Andrey Zagrebin wrote: > > Hi Yuval, > > You should be able to restore from the last checkpoint by restarting the job > with the same checkpoint

Re: Restoring state from an incremental RocksDB checkpoint

2020-03-13 Thread Andrey Zagrebin
Hi Yuval, You should be able to restore from the last checkpoint by restarting the job with the same checkpoint directory. An incremental part is removed only if none of retained checkpoints points to it. Best, Andrey > On 13 Mar 2020, at 16:06, Yuval Itzchakov wrote: > > Hi, &

Re: [Survey] Default size for the new JVM Metaspace limit in 1.10

2020-03-13 Thread Andrey Zagrebin
by the suggested change. Any feedback is appreciated. Best, Andrey On Tue, Mar 3, 2020 at 6:35 PM Andrey Zagrebin wrote: > Hi All, > > Recently, FLIP-49 [1] introduced the new JVM Metaspace limit in the 1.10 > release [2]. Flink scripts, which start the task manager JVM process

Re: [DISCUSS] FLIP-111: Docker image unification

2020-03-10 Thread Andrey Zagrebin
maybe with a fix for the web user > interface). For everything else, users can easily build their own images > based on library/flink (provide the dependencies, change the logging > configuration). agree Thanks, Andrey On Sun, Mar 8, 2020 at 8:55 PM Konstantin Knauf wrote: > Hi And

Re: Flink Deployment failing with RestClientException

2020-03-05 Thread Andrey Zagrebin
been started in fact despite the client error? Best, Andrey [1] https://issues.apache.org/jira/browse/FLINK-16429 <https://issues.apache.org/jira/browse/FLINK-16429> [2] https://issues.apache.org/jira/browse/FLINK-16018 <https://issues.apache.org/jira/browse/FLINK-16018> > On 5 Mar 2

[DISCUSS] FLIP-111: Docker image unification

2020-03-04 Thread Andrey Zagrebin
contributed version of Flink docker integration also contained example and docs for the integration with Bluemix in IBM cloud. We also suggest to maintain it outside of Flink repository (cc Markus Müller). Thanks, Andrey [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-111%3A+Docker+image

[Survey] Default size for the new JVM Metaspace limit in 1.10

2020-03-03 Thread Andrey Zagrebin
, please, report any specifics of your job, if you think it is relevant for this concern, and the option value that resolved it. There is also a dedicated Jira issue [6] for reporting. Thanks, Andrey [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+

Re: Colocating Sub-Tasks across jobs / Sharing Task Slots across jobs

2020-02-27 Thread Andrey Zagrebin
scheduler to decide where to deploy each job subtask. The restart strategy is only for the failure scenario. Any jobs changes require full job restart at the moment. I pull in Gary and Zhu to add more details if I miss something here. Best, Andrey On Tue, Feb 25, 2020 at 1:38 PM Xintong Song

Re: Tests in FileUtilsTest while building Flink in local

2020-02-21 Thread Andrey Zagrebin
These tests also fail on my mac. It may be some mac os setup related issue. I create a JIRA ticket for that: https://issues.apache.org/jira/browse/FLINK-16198 > On 20 Feb 2020, at 12:03, Chesnay Schepler wrote: > > Is the stacktrace identical in both tests? > > Did these fail on the command-l

Re: Rescaling a running topology

2020-02-11 Thread Andrey Zagrebin
, there should not be a big difference but it depends on the job, of course, whether the rescale operation is faster or not. Thanks, Andrey [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Temporarily-remove-support-for-job-rescaling-via-CLI-action-quot-modify-quot

Re: Task-manager kubernetes pods take a long time to terminate

2020-02-07 Thread Andrey Zagrebin
Hi guys, It looks suspicious that the TM pod termination is potentially delayed by the reconnect to a killed JM. I created an issue to investigate this: https://issues.apache.org/jira/browse/FLINK-15946 Let's continue the discussion there. Best, Andrey On Wed, Feb 5, 2020 at 11:49 AM Yang

Re: NoClassDefFoundError when an exception is throw inside invoke in a Custom Sink

2020-02-06 Thread Andrey Zagrebin
Hi David, This looks like a problem with resolution of maven dependencies or something. The custom WindowParquetGenericRecordListFileSink probably transitively depends on org/joda/time/format/DateTimeParserBucket and it is missing on the runtime classpath of Flink. Best, Andrey On Wed, Feb 5

Re: [DISCUSS] Change default for RocksDB timers: Java Heap => in RocksDB

2020-01-16 Thread Andrey Zagrebin
very limited subset of keys which fits into memory. Best, Andrey On Thu, Jan 16, 2020 at 4:27 PM Stephan Ewen wrote: > Hi all! > > I would suggest a change of the current default for timers. A bit of > background: > > - Timers (for windows, process functions, etc.) are state

Re: Scala ListBuffer cannot be used as a POJO type in Flink

2019-12-17 Thread Andrey Zagrebin
org.apache.flink.api.common.typeutils.TypeSerializer or register a custom serialiser [2] to use another state descriptor constructor: ListStateDescriptor(String name, TypeSerializer typeSerializer) or refactor your classes to support out of the box serialisation [3]. Best, Andrey [1] https

Re: Job manager is failing to start with an S3 no key specified exception [1.7.2]

2019-12-10 Thread Andrey Zagrebin
version of Flink 1.9? Best, Andrey On Mon, Dec 9, 2019 at 12:37 PM Kumar Bolar, Harshith wrote: > Hi all, > > > > I'm running a standalone Flink cluster with Zookeeper and S3 for high > availability storage. All of a sudden, the job managers start

Re: Emit intermediate accumulator state of a session window

2019-12-05 Thread Andrey Zagrebin
aggregated value. If you still need to do something with the result of windowing, you could do it as now or simulate it with timers [2] in that same stateful function. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/state/state.html#using-managed-keyed-state [2

[PROPOSAL/SURVEY] Enable background cleanup for state with TTL by default

2019-11-27 Thread Andrey Zagrebin
merge it next week and include into 1.10. Thanks, Andrey

Re: ProcessFunction Timer

2019-10-18 Thread Andrey Zagrebin
similar way. Best, Andrey On Thu, Oct 17, 2019 at 11:10 PM Navneeth Krishnan wrote: > Hi All, > > I'm currently using a tumbling window of 5 seconds using > TumblingTimeWindow but due to change in requirements I would not have to > window every incoming data. With that said

Re: about Kafka sink and 2PC function

2019-10-18 Thread Andrey Zagrebin
Hi, This is the contract of 2PC transactions. Multiple commit retries should result in only one commit which actually happens in the external system. The external system has to support deduplication of committed transactions, e.g. by some unique id. Best, Andrey > On 10 Oct 2019, at 07

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-19 Thread Andrey Zagrebin
Hi Everybody! Thanks a lot for the warn welcome! I am really happy about joining Flink committer team and hope to help the project to grow more. Cheers, Andrey On Fri, Aug 16, 2019 at 11:10 AM Terry Wang wrote: > Congratulations Andrey! > > Best, > Terry Wang > > > > 在

Re: Error while running flink job on local environment

2019-07-30 Thread Andrey Zagrebin
/browse/FLINK-12852 Best, Andrey On Tue, Jul 30, 2019 at 5:42 PM Vinayak Magadum wrote: > Hi, > > I am using Flink version: 1.7.1 > > I have a flink job that gets the execution environment as below and > executes the job. > > Stream

Re: StreamingFileSink part file count reset

2019-07-30 Thread Andrey Zagrebin
Hi Sidhartha, This is a general limitation now because Flink does not keep counters for all buckets but only a global one. Flink assumes that the sink can write to any bucket any time and the counter is not reset to not rewrite the previously written file number 0. Best, Andrey On Tue, Jul 30

Re: Job submission timeout with no error info.

2019-07-22 Thread Andrey Zagrebin
Hi Fakrudeen, Thanks for sharing the logs. Could you also try it with Flink 1.8? Best, Andrey On Sat, Jul 20, 2019 at 12:44 AM Fakrudeen Ali Ahmed wrote: > Hi Andrey, > > > > > > Flink version: 1.4.2 > > Please find the client log attached and job manager l

Re: Job submission timeout with no error info.

2019-07-19 Thread Andrey Zagrebin
Hi Fakrudeen, which Flink version do you use? could you share full client and job manager logs? Best, Andrey On Fri, Jul 19, 2019 at 7:00 PM Fakrudeen Ali Ahmed wrote: > Hi, > > > > We are submitting a Flink topology [YARN] and it fails during upload of > the jar

Re: Consuming data from dynamoDB streams to flink

2019-07-19 Thread Andrey Zagrebin
kinesis consumer parallelism Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/state/savepoints.html [2] https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/connectors/kinesis.html#kinesis-consumer On Fri, Jul 19, 2019 at 1:45 PM Vinay Patil wrote: >

Re: Apache Flink - Event time and process time timers with same timestamp

2019-07-19 Thread Andrey Zagrebin
Hi, Event and processing time timers have independent state storage. You can use both independently, so I would expect two firings with different domains. `TimeCharacteristic` is for operations where you do not explicitly tell the time type, like windowing. Best, Andrey On Fri, Jul 19, 2019 at

Re: Checkpoints timing out for no apparent reason

2019-07-19 Thread Andrey Zagrebin
investigate the reason of failures if you provide JM and TM logs. Best, Andrey [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/state/checkpointing.html#enabling-and-configuring-checkpointing <https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/stream/st

Re: Queryable state and TTL

2019-07-03 Thread Andrey Zagrebin
Hi Avi, It is on the road map but I am not aware about plans of any contributor to work on it for the next releases. I think the community will firstly work on the event time support for TTL. I will loop Yu in, maybe he has some plans to work on TTL for the queryable state. Best, Andrey On Wed

Re: Could not load the native RocksDB library

2019-07-03 Thread Andrey Zagrebin
Hi Samya, Additionally to Haibo's answer: Have you tried the previous 1.7 version of Flink? The Rocksdb version was upgraded in 1.8 version. Best, Andrey On Wed, Jul 3, 2019 at 5:21 AM Haibo Sun wrote: > Hi, Samya.Patro > > I guess this may be a setup problem. What OS and what

Re: Received fatal alert: certificate_unknown

2019-05-17 Thread Andrey Zagrebin
Hi Pedro, thanks for letting know! Best, Andrey On Fri, May 17, 2019 at 4:29 PM PedroMrChaves wrote: > We found the issue. > > It was using the DNSName for the certificate validation and we were > accessing via localhost. > > > > - > Best Regards, > Pedro C

Re: flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-16 Thread Andrey Zagrebin
The stack trace shows that the state is being restored which has probably already happened after job restart. I am wondering why it has been restarted after some time of running. Could you share full job/task manager logs? On Thu, May 16, 2019 at 6:26 AM anaray wrote: > Thank You Andrey. Ar

Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-16 Thread Andrey Zagrebin
flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:129) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 10 more > > On Wed, 15 May 2019 at 12:00, Andrey Zagrebin > wrote: > >> Hi John, >> &g

Re: Checkpoints periodically fail with hdfs as the state backend - Could not flush and close the file system output stream

2019-05-16 Thread Andrey Zagrebin
Hi, could you also post job master logs? and ideally full task manager logs. This failure can be caused by some other previous failure. Best, Andrey On Wed, May 15, 2019 at 2:48 PM PedroMrChaves wrote: > Hello, > > Every once in a while our checkpoints fail with the following

Re: Getting java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError when stopping/canceling job.

2019-05-15 Thread Andrey Zagrebin
Hi John, could you share the full stack trace or better logs? It looks like something is trying to be executed in vertx.io code after the local task has been stopped and the class loader for the user code has been unloaded. Maybe from some daemon thread pool. Best, Andrey On Wed, May 15, 2019

Re: flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-15 Thread Andrey Zagrebin
start to happen and why the operator state was restored? Job restart? Best, Andrey

Re: Table program cannot be compiled

2019-05-15 Thread Andrey Zagrebin
Hi, I am looping in Timo and Dawid to look at the problem. On Tue, May 14, 2019 at 9:12 PM shkob1 wrote: > BTW looking at past posts on this issue[1] it should have been fixed? i'm > using version 1.7.2 > Also the recommendation was to use a custom function, though that's exactly > what im doing

Re: Migrating Existing TTL State to 1.8

2019-05-15 Thread Andrey Zagrebin
, you can combine only compaction filter with full snapshotting cleanup with RocksDB backend. Best, Andrey On Fri, Mar 15, 2019 at 11:56 PM Ning Shi wrote: > Hi Stefan, > > Thank you for the confirmation. > > Doing a one time cleanup with full snapshot and upgrading to Flink 1

Re: [Discuss] Semantics of event time for state TTL

2019-04-15 Thread Andrey Zagrebin
can be tolerated or not and whether an additional stream ordering operator needs to be put before TTL state access. I would still consider TTL event time feature to be implemented as we have some user requests for it. Any further feedback is appreciated. Best, Andrey On Tue, Apr 9, 2019 at 5:26

[Discuss] Semantics of event time for state TTL

2019-04-04 Thread Andrey Zagrebin
ider pluggable for users. The interface can give users context (currently processed record, watermark etc) and ask them which timestamp to use. This is more complicated though. Looking forward for your feedback. Best, Andrey [1] https://issues.apache.org/jira/browse/FLINK-12005 [2] h

Re: How can I get the right TaskExecutor in ProcessFunction

2019-04-02 Thread Andrey Zagrebin
your ProcessFunction: RuntimeContext.getIndexOfThisSubtask out of RuntimeContext.getNumberOfParallelSubtasks. Best, Andrey On Mon, Apr 1, 2019 at 11:33 AM peibin wang wrote: > Hi all > > I want to get the right TaskExecutor where the ProcessFunction run at. > Is there any way to

Re: Async Function Not Generating Backpressure

2019-03-21 Thread Andrey Zagrebin
Hi Seed, when you create `AsyncDataStream.(un)orderedWait` which capacity do you pass in or you use the default one (100)? Best, Andrey On Thu, Mar 21, 2019 at 2:49 AM Seed Zeng wrote: > Hey Andrey and Ken, > Sorry about the late reply. I might not have been clear in my question

Re: Async Function Not Generating Backpressure

2019-03-20 Thread Andrey Zagrebin
result and Cassandra sink probably uses that thread to write data. This might parallelize and pipeline previous steps like Kafka fetching and Cassandra IO but I am also not sure about this explanation. Best, Andrey On Tue, Mar 19, 2019 at 8:05 PM Ken Krugler wrote: > Hi Seed, > > I was

Re: ingesting time for TimeCharacteristic.IngestionTime on unit test

2019-03-19 Thread Andrey Zagrebin
Hi Avi, do you use processing time timer (timerService().registerProcessingTimeTimer)? why do you need ingestion time? do you set TimeCharacteristic.IngestionTime? Best, Andrey On Tue, Mar 19, 2019 at 1:11 PM Avi Levi wrote: > Hi, > Our stream is not based on time sequence and we do n

Re: Flink 1.7.2 extremely unstable and losing jobs in prod

2019-03-19 Thread Andrey Zagrebin
Hi Bruno, could you also share the job master logs? Thanks, Andrey On Tue, Mar 19, 2019 at 12:03 PM Bruno Aranda wrote: > Hi, > > This is causing serious instability and data loss in our production > environment. Any help figuring out what's going on here would be really >

Re: End-to-end exactly-once semantics in simple streaming app

2019-03-19 Thread Andrey Zagrebin
processed again. This has though a caveat that database might have stale data between checkpoints. Once the current state is synced with database, depending on your App, it might be even cleared from Flink state. I also cc Piotr and Kostas, maybe, they have more ideas. Best, Andrey On Tue, Mar 19, 2019

Re: Async Function Not Generating Backpressure

2019-03-19 Thread Andrey Zagrebin
, AsyncFunction.asyncInvoke needs addition throttling on submitting requests to Cassandra. See also [1] for Cassandra sink with throttling. Best, Andrey [1] https://issues.apache.org/jira/browse/FLINK-9083 On Tue, Mar 19, 2019 at 3:48 AM Seed Zeng wrote: > Flink Version - 1.6.1 > > In our applic

  1   2   3   >