[jira] [Created] (FLINK-23987) Move Powered By section into separate page

2021-08-26 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23987:


 Summary: Move Powered By section into separate page
 Key: FLINK-23987
 URL: https://issues.apache.org/jira/browse/FLINK-23987
 Project: Flink
  Issue Type: Technical Debt
  Components: Project Website
Reporter: Chesnay Schepler


The PMC was just informed that it is not allowed to have a Powered By section 
on the main homepage. We need to move it into a dedicated page.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23988) Test corner cases

2021-08-26 Thread Piotr Nowojski (Jira)
Piotr Nowojski created FLINK-23988:
--

 Summary: Test corner cases
 Key: FLINK-23988
 URL: https://issues.apache.org/jira/browse/FLINK-23988
 Project: Flink
  Issue Type: Sub-task
  Components: Runtime / Network
Reporter: Piotr Nowojski


Check how buffer debloating behaves in case of:
* data skew
* multiple/two/union inputs
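
A minimal sketch of a job that could exercise both cases, assuming the 1.14 option {{taskmanager.network.memory.buffer-debloat.enabled}} (job layout and numbers are illustrative only):
{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class DebloatingCornerCases {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // enable adaptive buffer debloating (assumed 1.14 option key)
        conf.setBoolean("taskmanager.network.memory.buffer-debloat.enabled", true);
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);

        // data skew: route almost all records to a single key
        DataStream<Long> skewed = env.fromSequence(0, 1_000_000)
                .keyBy(i -> i % 100 == 0 ? i : 0L)
                .reduce(Long::sum);

        // two inputs / union inputs: merge in a second source
        DataStream<Long> other = env.fromSequence(0, 1_000);
        skewed.union(other).print();

        env.execute("debloating-corner-cases");
    }
}
{code}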



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23989) Flink SQL visibility

2021-08-26 Thread wangzhihao (Jira)
wangzhihao created FLINK-23989:
--

 Summary: Flink SQL visibility 
 Key: FLINK-23989
 URL: https://issues.apache.org/jira/browse/FLINK-23989
 Project: Flink
  Issue Type: New Feature
  Components: Runtime / Queryable State
Reporter: wangzhihao


It is desirable to inspect the internal states generated by SQL, especially 
for debugging purposes. We propose to add a new feature to the Flink UI console: 
when a user clicks on a vertex of the job DAG, we list all states the vertex 
has in a separate panel (let's call it the *states panel*). On this panel users 
can query the states by key. The returned value should be a human-readable 
string instead of an opaque binary blob.

 

In particular, we need to expose the states as queryable. But currently the user 
experience of queryable state is cumbersome: only the serialized value is 
returned to the client, and users need to handle deserialization by themselves. 
What's worse, the client needs to construct the serializer and type manually. To 
improve this situation, we propose:
 # Add a new API to find all queryable states associated with a job vertex. This 
can be done by checking against the KvStateLocationRegistry, which stores the 
mapping between JobVertexId and states.
 # Add a new API that allows users to get the types of queryable states: for a 
registered name (String), the Queryable State Server will return the types of the 
key and the value 
([LogicalType|https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/LogicalType.java]).
 # To generate a human-readable string with the API from step 2, we can 1) generate a 
[TypeSerializer|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java]
 from the LogicalType, so as to handle serde automatically, and 2) convert 
internal data structures to external data structures to produce a printable 
string (with 
[DataStructureConverter|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/table/data/conversion/DataStructureConverter.html]
 converters).

With all these steps and some modifications to the Web UI / REST layer, we can 
enable users to query SQL internal states.
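
For context, below is a minimal sketch of what querying a single value looks like with today's QueryableStateClient: the caller must reconstruct the key type information and the exact state descriptor by hand, which is precisely the friction steps 1 and 2 would remove (host, port, job id, and state name are placeholders):
{code:java}
import java.util.concurrent.CompletableFuture;

import org.apache.flink.api.common.JobID;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.queryablestate.client.QueryableStateClient;

public class QueryStateToday {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port of the queryable state proxy.
        QueryableStateClient client = new QueryableStateClient("tm-host", 9069);

        // The client must reconstruct the exact descriptor that the job
        // registered; this is the manual part the proposal removes.
        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("my-count", Types.LONG);

        CompletableFuture<ValueState<Long>> result = client.getKvState(
                JobID.fromHexString(args[0]),  // job id passed as argument
                "my-count",                    // registered queryable name
                42L,                           // key to look up
                BasicTypeInfo.LONG_TYPE_INFO,  // key type info, built by hand
                descriptor);

        // Raw value; turning it into a readable string is up to the user.
        System.out.println(result.get().value());
        client.shutdownAndWait();
    }
}
{code}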

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23990) Replace custom monaco editor component

2021-08-26 Thread Ingo Bürk (Jira)
Ingo Bürk created FLINK-23990:
-

 Summary: Replace custom monaco editor component
 Key: FLINK-23990
 URL: https://issues.apache.org/jira/browse/FLINK-23990
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Web Frontend
Affects Versions: 1.14.0
Reporter: Ingo Bürk


After the upgrade to Angular 12, we should investigate whether we can replace the 
custom flink-monaco-editor component with the one shipped with ng-zorro.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23991) Specifying yarn.staging-dir fails when staging scheme is different from default fs scheme

2021-08-26 Thread Junfan Zhang (Jira)
Junfan Zhang created FLINK-23991:


 Summary: Specifying yarn.staging-dir fails when staging scheme is 
different from default fs scheme
 Key: FLINK-23991
 URL: https://issues.apache.org/jira/browse/FLINK-23991
 Project: Flink
  Issue Type: Bug
  Components: Deployment / YARN
Affects Versions: 1.13.2
Reporter: Junfan Zhang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23992) Update doc version in release-1.13 to 1.13.2

2021-08-26 Thread Yun Tang (Jira)
Yun Tang created FLINK-23992:


 Summary: Update doc version in release-1.13 to 1.13.2
 Key: FLINK-23992
 URL: https://issues.apache.org/jira/browse/FLINK-23992
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Reporter: Yun Tang
 Fix For: 1.13.2


The current version in the docs of branch release-1.13 is still 1.13.0; we should 
bump it to 1.13.2.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23993) Describe eventual consistency of materializing upserts

2021-08-26 Thread Nico Kruber (Jira)
Nico Kruber created FLINK-23993:
---

 Summary: Describe eventual consistency of materializing upserts
 Key: FLINK-23993
 URL: https://issues.apache.org/jira/browse/FLINK-23993
 Project: Flink
  Issue Type: Sub-task
  Components: Documentation, Table SQL / Ecosystem
Affects Versions: 1.14.0
Reporter: Nico Kruber


FLINK-20374 added an upsert materialization operator which fixes the order of 
shuffled streams. The results of this operator are actually _eventually 
consistent_ (it keeps the latest value it has seen and retracts older versions 
once they are no longer valid). You could see a result stream like the 
following, depending on the order in which the materialization receives events:

+I10, -I10, +I5, -I5, +I10, -I10, +I3, -I3, +I10

Each time, the value stored in Kafka would change until the "final" result is 
in.

 

It may be acceptable for upsert sinks, but should be documented (or 
changed/fixed) nonetheless.
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23994) ArrayDataSerializer and MapDataSerializer don't handle null values correctly

2021-08-26 Thread Dian Fu (Jira)
Dian Fu created FLINK-23994:
---

 Summary: ArrayDataSerializer and MapDataSerializer don't handle 
null values correctly
 Key: FLINK-23994
 URL: https://issues.apache.org/jira/browse/FLINK-23994
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.13.0, 1.12.0, 1.11.0, 1.10.0, 1.14.0
Reporter: Dian Fu
Assignee: Dian Fu
 Fix For: 1.14.0, 1.12.6, 1.13.3






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23995) Hive dialect: `insert overwrite table partition if not exists` will throw an exception when the table name is like 'database.table'

2021-08-26 Thread luoyuxia (Jira)
luoyuxia created FLINK-23995:


 Summary: Hive dialect: `insert overwrite table partition if not 
exists` will throw an exception when the table name is like 'database.table'
 Key: FLINK-23995
 URL: https://issues.apache.org/jira/browse/FLINK-23995
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Hive
Reporter: luoyuxia


When running a Hive SQL statement such as

 
{code:java}
insert overwrite table default.dest2 partition (p1=1,p2='static') if not exists 
select x from src 
{code}
it throws an exception:
{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not 
found default{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23996) Make Scala 2.12 default build target

2021-08-26 Thread Danny Cranmer (Jira)
Danny Cranmer created FLINK-23996:
-

 Summary: Make Scala 2.12 default build target
 Key: FLINK-23996
 URL: https://issues.apache.org/jira/browse/FLINK-23996
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Reporter: Danny Cranmer
 Fix For: 1.15.0


*Background*

Flink currently supports Scala 2.11 and 2.12. There is a plan to [drop Scala 
2.11 support|https://issues.apache.org/jira/browse/FLINK-20845], however in the 
meantime, 2.11 is the default Scala version.

*What*

Make Scala 2.12 the default version.

*How*

- Update {{pom.xml}} files to default to Scala 2.12
- Update documentation to reflect this and to instruct how to build Scala 2.11
- Update CI scripts 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Scala 2.12 End to End Tests

2021-08-26 Thread Danny Cranmer
I have raised the Jira [1]. If no one picks this up I will try to find some
time next week to make a start.

Thanks.

[1] https://issues.apache.org/jira/browse/FLINK-23996

On Tue, Aug 24, 2021 at 11:37 AM Till Rohrmann  wrote:

> Great, thanks Danny.
>
> On Mon, Aug 23, 2021 at 9:41 PM Danny Cranmer 
> wrote:
>
> > Thanks Till!
> >
> > +1 making Scala 2.12 the default. I will raise a Jira tomorrow, not at
> the
> > laptop today.
> >
> > On Mon, 23 Aug 2021, 09:43 Martijn Visser, 
> wrote:
> >
> > > +1 to switch to 2.12 by default
> > >
> > > On Sat, 21 Aug 2021 at 13:43, Chesnay Schepler 
> > wrote:
> > >
> > > > I'm with Till that we should switch to 2.12 by default .
> > > >
> > > > On 21/08/2021 11:12, Till Rohrmann wrote:
> > > > > Hi Danny,
> > > > >
> > > > > I think in the nightly builds we do run the e2e with Scala 2.12
> [1].
> > > The
> > > > > way it is configured is via the PROFILE env variable if I am not
> > > > mistaken.
> > > > >
> > > > > Independent of this we might wanna start a discussion whether we
> > don't
> > > > want
> > > > > to switch to Scala 2.12. per default.
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22579&view=logs&j=08866332-78f7-59e4-4f7e-49a56faa3179
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Fri, Aug 20, 2021 at 10:59 PM Danny Cranmer <
> > > dannycran...@apache.org>
> > > > > wrote:
> > > > >
> > > > >> Hello,
> > > > >>
> > > > >> I am working on reviewing a PR [1] to add JSON support for the AWS
> > > Glue
> > > > >> Schema Registry integration. The module only builds for Scala 2.12
> > due
> > > > to a
> > > > >> dependency incompatibility. The end-to-end tests have been
> > implemented
> > > > >> using the newer Java framework rather than bash scripts. However,
> it
> > > > >> appears as though end-to-end tests are only run for Scala 2.11,
> > > > therefore
> > > > >> the new tests are not actually being exercised. Questions:
> > > > >> - Please correct me if I am wrong here, E2E java tests are only
> run
> > > for
> > > > >> Scala 2.11 [2]?
> > > > >> - Is there a reason we are not running tests for Scala 2.11 AND
> > Scala
> > > > 2.12
> > > > >> - How do we proceed? I would be inclined with number 1, unless
> there
> > > is
> > > > a
> > > > >> good reason not to, besides increase in time (they already take a
> > long
> > > > >> time)
> > > > >>1. Enable ALL Scala 2.12 tests?
> > > > >>2. Just run the new tests with Scala 2.12?
> > > > >>3. Do not run the new tests
> > > > >>
> > > > >> [1] https://github.com/apache/flink/pull/16513
> > > > >> [2]
> > > > >>
> > > > >>
> > > >
> > >
> >
> https://github.com/apache/flink/blob/master/flink-end-to-end-tests/run-nightly-tests.sh#L264
> > > > >>
> > > > >> Thanks,
> > > > >> Danny Cranmer.
> > > > >>
> > > >
> > > >
> > >
> >
>


[jira] [Created] (FLINK-23997) SQL windowing table-valued function improvement

2021-08-26 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-23997:
--

 Summary: SQL windowing table-valued function improvement
 Key: FLINK-23997
 URL: https://issues.apache.org/jira/browse/FLINK-23997
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Planner
Affects Versions: 1.15.0
Reporter: JING ZHANG


This is an umbrella issue for follow-up issues related to the windowing 
table-valued function.

FLIP-145: 
[https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function#FLIP145:SupportSQLwindowingtablevaluedfunction-CumulatingWindows]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-23998) Scopt not mentioned in flink-dist NOTICE file

2021-08-26 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-23998:


 Summary: Scopt not mentioned in flink-dist NOTICE file
 Key: FLINK-23998
 URL: https://issues.apache.org/jira/browse/FLINK-23998
 Project: Flink
  Issue Type: Bug
  Components: Build System
Affects Versions: 1.14.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.14.0


com.github.scopt:scopt_2.11:3.5.0 is still bundled by flink-dist because of the 
scala-shell, but is not mentioned in the NOTICE file.

We should add it, and check why the notice check did not catch it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Apache Flink Stateful Functions 3.1.0, release candidate #1

2021-08-26 Thread Arvid Heise
+1 (binding)

- Built from downloaded sources with Java 8 (mvn install -Prun-e2e-tests)
- Verified signatures and hashes

Best,

Arvid

On Thu, Aug 26, 2021 at 8:32 AM Tzu-Li (Gordon) Tai 
wrote:

> +1 (binding)
>
> - Built from source with Java 11 and Java 8 (mvn clean install
> -Prun-e2e-tests)
> - verified signatures and hashes
> - verified NOTICE files of Maven artifacts properly list actual bundled
> dependencies
> - Ran GoLang greeter and showcase with the proposed Dockerfiles for 3.1.0
> - Ran a local smoke E2E against the Java SDK, with adjusted parameters to
> run for a longer period of time
>
> Thanks for driving the release Igal!
>
> Cheers,
> Gordon
>
> On Thu, Aug 26, 2021 at 4:06 AM Seth Wiesman  wrote:
>
> > +1 (non-binding)
> >
> > - verified signatures and hashes
> > - Checked licenses
> > - ran mvn clean install -Prun-e2e-tests
> > - ran golang greeter and showcase from the playground [1]
> >
> > Seth
> >
> > [1] https://github.com/apache/flink-statefun-playground/pull/12
> >
> > On Wed, Aug 25, 2021 at 9:44 AM Igal Shilman  wrote:
> >
> > > +1 from my side:
> > >
> > > Here are the results of my RC2 testing:
> > >
> > > - verified the signatures and hashes
> > > - verified that the source distribution doesn't contain any binary
> files
> > > - ran mvn clean install -Prun-e2e-tests
> > > - ran Java and Python greeters from the playground [1] with the new
> > module
> > > structure, and async transport enabled.
> > > - verified that the docker image [2] builds and inspected the contents
> > > manually.
> > >
> > > Thanks,
> > > Igal
> > >
> > > [1] https://github.com/apache/flink-statefun-playground/tree/dev
> > > [2] https://github.com/apache/flink-statefun-docker/pull/15
> > >
> > >
> > > On Tue, Aug 24, 2021 at 3:34 PM Igal Shilman  wrote:
> > >
> > > > Sorry, the subject of the previous message should have said "[VOTE]
> > > Apache
> > > > Flink Stateful Functions 3.1.0, release candidate #2".
> > > >
> > > >
> > > > On Tue, Aug 24, 2021 at 3:24 PM Igal Shilman 
> wrote:
> > > >
> > > >> Hi everyone,
> > > >>
> > > >> Please review and vote on the release candidate #2 for the version
> > 3.1.0
> > > >> of Apache Flink Stateful Functions, as follows:
> > > >> [ ] +1, Approve the release
> > > >> [ ] -1, Do not approve the release (please provide specific
> comments)
> > > >>
> > > >> **Testing Guideline**
> > > >>
> > > >> You can find here [1] a page in the project wiki on instructions for
> > > >> testing.
> > > >> To cast a vote, it is not necessary to perform all listed checks,
> > > >> but please mention which checks you have performed when voting.
> > > >>
> > > >> **Release Overview**
> > > >>
> > > >> As an overview, the release consists of the following:
> > > >> a) Stateful Functions canonical source distribution, to be deployed
> to
> > > >> the release repository at dist.apache.org
> > > >> b) Stateful Functions Python SDK distributions to be deployed to
> PyPI
> > > >> c) Maven artifacts to be deployed to the Maven Central Repository
> > > >> d) New Dockerfiles for the release
> > > >> e) GoLang SDK tag statefun-sdk-go/v3.1.0-rc2
> > > >>
> > > >> **Staging Areas to Review**
> > > >>
> > > >> The staging areas containing the above mentioned artifacts are as
> > > >> follows, for your review:
> > > >> * All artifacts for a) and b) can be found in the corresponding dev
> > > >> repository at dist.apache.org [2]
> > > >> * All artifacts for c) can be found at the Apache Nexus Repository
> [3]
> > > >>
> > > >> All artifacts are signed with the key
> > > >> 73BC0A2B04ABC80BF0513382B0ED0E338D622A92 [4]
> > > >>
> > > >> Other links for your review:
> > > >> * JIRA release notes [5]
> > > >> * source code tag "release-3.1.0-rc2" [6]
> > > >> * PR for the new Dockerfiles [7]
> > > >>
> > > >> **Vote Duration**
> > > >>
> > > >> The voting time will run for at least 72 hours (since RC1). We are
> > > >> targeting this vote to last until Thursday. 26th of August, 6pm CET.
> > > >> If it is adopted by majority approval, with at least 3 PMC
> affirmative
> > > >> votes, it will be released.
> > > >>
> > > >> Thanks,
> > > >> Igal
> > > >>
> > > >> [1]
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/Verifying+a+Flink+Stateful+Functions+Release
> > > >> [2]
> > > >>
> > https://dist.apache.org/repos/dist/dev/flink/flink-statefun-3.1.0-rc2/
> > > >> [3]
> > > >>
> > https://repository.apache.org/content/repositories/orgapacheflink-1446/
> > > >> [4] https://dist.apache.org/repos/dist/release/flink/KEYS
> > > >> [5]
> > > >>
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12350038&projectId=12315522
> > > >> [6] https://github.com/apache/flink-statefun/tree/release-3.1.0-rc2
> > > >> [7] https://github.com/apache/flink-statefun-docker/pull/15
> > > >>
> > > >>
> > >
> >
>


[jira] [Created] (FLINK-23999) Support evaluating individual window table-valued function

2021-08-26 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-23999:
--

 Summary: Support evaluating individual window table-valued function
 Key: FLINK-23999
 URL: https://issues.apache.org/jira/browse/FLINK-23999
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Reporter: JING ZHANG
 Fix For: 1.15.0


Currently, the window table-valued function has to be used together with other 
window operations, such as window aggregate, window top-N, and window join.

In this ticket, we aim to support evaluating an individual window table-valued 
function.
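
As an illustration, here is a hedged sketch of what a standalone window TVF query would look like once this is supported, using the FLIP-145 TUMBLE function (the Bid table and its columns are made up for the example):
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class WindowTvfExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Today this query is only valid when followed by a window
        // aggregate/join/top-N; the goal here is to let it run on its own.
        tEnv.executeSql(
                "SELECT * FROM TABLE(" +
                "  TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))")
            .print();
    }
}
{code}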



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24000) window aggregate support allow lateness

2021-08-26 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-24000:
--

 Summary: window aggregate support allow lateness
 Key: FLINK-24000
 URL: https://issues.apache.org/jira/browse/FLINK-24000
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.15.0
Reporter: JING ZHANG


Currently, [Window 
aggregate|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/window-agg]
 does not support allow-lateness like [Group Window 
Aggregate|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/window-agg/#group-window-aggregation/]
 does.

We aim to support allow-lateness in this ticket.
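
For reference, below is a sketch of the experimental emit options that the legacy group window aggregation understands today; to my knowledge these are not honored by the new window TVF aggregate, which is what this ticket would address. The option keys are experimental and undocumented, so treat them as assumptions:
{code:java}
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class AllowLatenessSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Experimental emit options of the legacy group window aggregation
        // (assumed keys, not part of the stable configuration surface):
        tEnv.getConfig().getConfiguration()
                .setString("table.exec.emit.allow-lateness", "1 h");
        tEnv.getConfig().getConfiguration()
                .setString("table.exec.emit.late-fire.enabled", "true");
        tEnv.getConfig().getConfiguration()
                .setString("table.exec.emit.late-fire.delay", "1 min");
    }
}
{code}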



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Apache Flink Stateful Functions 3.1.0, release candidate #1

2021-08-26 Thread Stephan Ewen
+1 (binding)

  - Build project on Java 8 (maven command line) with full end to end tests
  - Compiled and ran tests in IDE (IntelliJ) with Java 11 (possible after
some manual config, see comment below to automate this)
  - No binaries in the distribution
  - Verified license and notice files
  - Checked README


Minor issues found, none of which are release blockers:

  - Some warnings in the command line due to a missing version of a plugin.
Fixing that is good for stability (and better resilience against supply
chain attacks): See this PR for a fix:
https://github.com/apache/flink-statefun/pull/261

  - Building in IDE (IntelliJ) with Java 11 does not work out of the box,
due to issues with the Java Module System and the use of Unsafe in
generated ProtoBuf classes. See this PR for a fix:
https://github.com/apache/flink-statefun/pull/262




[jira] [Created] (FLINK-24001) Support evaluating individual window table-valued function in runtime

2021-08-26 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-24001:
--

 Summary: Support evaluating individual window table-valued 
function in runtime
 Key: FLINK-24001
 URL: https://issues.apache.org/jira/browse/FLINK-24001
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Runtime
Affects Versions: 1.15.0
Reporter: JING ZHANG


Currently, the window table-valued function has to be used together with other 
window operations, such as window aggregate, window top-N, and window join.

In this ticket, we aim to support evaluating an individual window table-valued 
function at runtime, which means introducing an operator to handle this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24002) Support count window with the window TVF in planner

2021-08-26 Thread JING ZHANG (Jira)
JING ZHANG created FLINK-24002:
--

 Summary: Support count window with the window TVF in planner
 Key: FLINK-24002
 URL: https://issues.apache.org/jira/browse/FLINK-24002
 Project: Flink
  Issue Type: Sub-task
  Components: Table SQL / Planner
Affects Versions: 1.15.0
Reporter: JING ZHANG


Count windows have long been supported in the Table API, but not in SQL.

With the new window TVF syntax, we can also introduce a new window function for 
count windows.

For example, the following TUMBLE_ROW assigns windows in 10-row count intervals:
{code:sql}
SELECT *
FROM TABLE(
  TUMBLE_ROW(
    data => TABLE inputTable,
    timecol => DESCRIPTOR(timecol),
    size => 10));
{code}

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24004) Introduce a source/sink testing harness

2021-08-26 Thread Ingo Bürk (Jira)
Ingo Bürk created FLINK-24004:
-

 Summary: Introduce a source/sink testing harness
 Key: FLINK-24004
 URL: https://issues.apache.org/jira/browse/FLINK-24004
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API, Table SQL / Planner
Reporter: Ingo Bürk
Assignee: Ingo Bürk


For many tests we require a source/sink plus factory with specific features 
and/or abilities. Currently, the "values" source exists for this, but it has 
become quite bloated, and implementing a specific set of abilities is difficult 
to reflect there. On the other hand, implementing new factories for individual 
tests adds a lot of boilerplate and requires many factories to be registered.

We want to introduce a new factory (alongside the existing "values" one) which 
simply loads a user-defined class as the source, removing a bunch of that 
boilerplate; a sketch follows below.
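
A minimal sketch of what such a factory could look like, assuming the standard DynamicTableSourceFactory SPI; the identifier and option name are hypothetical:
{code:java}
import java.util.Collections;
import java.util.Set;

import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;
import org.apache.flink.table.api.ValidationException;
import org.apache.flink.table.connector.source.DynamicTableSource;
import org.apache.flink.table.factories.DynamicTableSourceFactory;
import org.apache.flink.table.factories.FactoryUtil;

public class TestHarnessTableFactory implements DynamicTableSourceFactory {

    // hypothetical option: fully-qualified class name of the test source
    public static final ConfigOption<String> SOURCE_CLASS =
            ConfigOptions.key("source-class").stringType().noDefaultValue();

    @Override
    public String factoryIdentifier() {
        return "test-harness"; // hypothetical identifier
    }

    @Override
    public DynamicTableSource createDynamicTableSource(Context context) {
        FactoryUtil.TableFactoryHelper helper =
                FactoryUtil.createTableFactoryHelper(this, context);
        helper.validate();
        String className = helper.getOptions().get(SOURCE_CLASS);
        try {
            // instantiate the user-defined source class reflectively
            return (DynamicTableSource) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (Exception e) {
            throw new ValidationException(
                    "Could not instantiate source class: " + className, e);
        }
    }

    @Override
    public Set<ConfigOption<?>> requiredOptions() {
        return Collections.<ConfigOption<?>>singleton(SOURCE_CLASS);
    }

    @Override
    public Set<ConfigOption<?>> optionalOptions() {
        return Collections.emptySet();
    }
}
{code}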

Additional tasks:
 * Use this harness in PushProjectIntoTableSourceScanRuleTest
 * Migrate the "values" factory to the new config options class pattern so 
options can be used programmatically.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24003) Loopback mode doesn't work when mixing use of Python Table API and Python DataStream API

2021-08-26 Thread Dian Fu (Jira)
Dian Fu created FLINK-24003:
---

 Summary: Loopback mode doesn't work when mixing use of Python 
Table API and Python DataStream API
 Key: FLINK-24003
 URL: https://issues.apache.org/jira/browse/FLINK-24003
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.14.0
Reporter: Dian Fu
Assignee: Huang Xingbo
 Fix For: 1.14.0


For the following program:
{code}
import logging
import time

from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment, CoMapFunction
from pyflink.table import StreamTableEnvironment, DataTypes, Schema


def test_chaining():
    env = StreamExecutionEnvironment.get_execution_environment()
    t_env = StreamTableEnvironment.create(stream_execution_environment=env)

    t_env.get_config().get_configuration().set_boolean(
        "python.operator-chaining.enabled", False)

    # 1. create source Table
    t_env.execute_sql("""
        CREATE TABLE datagen (
            id INT,
            data STRING
        ) WITH (
            'connector' = 'datagen',
            'rows-per-second' = '100',
            'fields.id.kind' = 'sequence',
            'fields.id.start' = '1',
            'fields.id.end' = '1000'
        )
    """)

    # 2. create sink Table
    t_env.execute_sql("""
        CREATE TABLE print (
            id BIGINT,
            data STRING,
            flag STRING
        ) WITH (
            'connector' = 'blackhole'
        )
    """)

    t_env.execute_sql("""
        CREATE TABLE print_2 (
            id BIGINT,
            data STRING,
            flag STRING
        ) WITH (
            'connector' = 'blackhole'
        )
    """)

    # 3. query from source table and perform calculations
    # create a Table from a Table API query:
    source_table = t_env.from_path("datagen")

    ds = t_env.to_append_stream(
        source_table,
        Types.ROW([Types.INT(), Types.STRING()]))

    ds1 = ds.map(lambda i: (i[0] * i[0], i[1]))
    ds2 = ds.map(lambda i: (i[0], i[1][2:]))

    class MyCoMapFunction(CoMapFunction):

        def map1(self, value):
            print('hahah')
            return value

        def map2(self, value):
            print('hahah')
            return value

    ds3 = ds1.connect(ds2).map(
        MyCoMapFunction(),
        output_type=Types.TUPLE([Types.LONG(), Types.STRING()]))

    ds4 = ds3.map(lambda i: (i[0], i[1], "left"),
                  output_type=Types.TUPLE(
                      [Types.LONG(), Types.STRING(), Types.STRING()]))

    ds5 = ds3.map(lambda i: (i[0], i[1], "right")) \
        .map(lambda i: i,
             output_type=Types.TUPLE(
                 [Types.LONG(), Types.STRING(), Types.STRING()]))

    schema = Schema.new_builder() \
        .column("f0", DataTypes.BIGINT()) \
        .column("f1", DataTypes.STRING()) \
        .column("f2", DataTypes.STRING()) \
        .build()

    result_table_3 = t_env.from_data_stream(ds4, schema)
    statement_set = t_env.create_statement_set()
    statement_set.add_insert("print", result_table_3)

    result_table_4 = t_env.from_data_stream(ds5, schema)
    statement_set.add_insert("print_2", result_table_4)

    statement_set.execute().wait()


if __name__ == "__main__":
    start_ts = time.time()
    test_chaining()
    end_ts = time.time()
    print("--- %s seconds ---" % (end_ts - start_ts))
{code}

Loopback mode doesn't work.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24005) Resource requirements declaration may be incorrect if JobMaster disconnects with a TaskManager with available slots in the SlotPool

2021-08-26 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-24005:
---

 Summary: Resource requirements declaration may be incorrect if 
JobMaster disconnects with a TaskManager with available slots in the SlotPool
 Key: FLINK-24005
 URL: https://issues.apache.org/jira/browse/FLINK-24005
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.13.2, 1.12.5, 1.14.0
Reporter: Zhu Zhu


When a TaskManager disconnects from the JobMaster, it triggers 
`DeclarativeSlotPoolService#decreaseResourceRequirementsBy()` for all the slots 
that were registered to the JobMaster from that TaskManager. If the slots are 
still available, i.e. not assigned to any task, the 
`decreaseResourceRequirementsBy` may lead to an incorrect resource requirements 
declaration.

For example, take a job with 3 source tasks only. It requires 3 slots and 
declares a requirement of 3 slots. Initially all the tasks are running. Suddenly 
one task fails and waits for some delay before restarting. Its previous slot is 
returned to the SlotPool. Now the job requires 2 slots and declares 2 slots. At 
this moment, the TaskManager of that returned slot gets lost. After the 
triggered `decreaseResourceRequirementsBy`, the job only declares 1 slot. 
Finally, when the failed task is re-scheduled, the job will declare 2 slots 
while it actually needs 3.

The attached log of a real job and the logs of the added test in 
https://github.com/zhuzhurk/flink/commit/59ca0ac5fa9c77b97c6e8a43dcc53ca8a0ad6c37
 demonstrate this case.
Note that the real job is configured with a large 
"restart-strategy.fixed-delay.delay" and a large "slot.idle.timeout", so this is 
possibly a rare case in production.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: 【Announce】Zeppelin 0.10.0 is released, Flink on Zeppelin Improved

2021-08-26 Thread Till Rohrmann
Cool, thanks for letting us know Jeff. Hopefully, many users use Zeppelin
together with Flink.

Cheers,
Till

On Thu, Aug 26, 2021 at 4:47 AM Leonard Xu  wrote:

> Thanks Jeff for the great work !
>
> Best,
> Leonard
>
> On Aug 25, 2021, at 22:48, Jeff Zhang wrote:
>
> Hi Flink users,
>
> We (Zeppelin community) are very excited to announce Zeppelin 0.10.0 is
> officially released. In this version, we made several improvements on Flink
> interpreter.  Here's the main features of Flink on Zeppelin:
>
>- Support multiple versions of Flink
>- Support multiple versions of Scala
>- Support multiple languages
>- Support multiple execution modes
>- Support Hive
>- Interactive development
>- Enhancement on Flink SQL
>- Multi-tenancy
>- Rest API Support
>
> Take a look at this document for more details:
> https://zeppelin.apache.org/docs/0.10.0/interpreter/flink.html
> The quickest way to try Flink on Zeppelin is via its docker image
> https://zeppelin.apache.org/docs/0.10.0/interpreter/flink.html#play-flink-in-zeppelin-docker
>
> Besides these, here’s one blog about how to run Flink sql cookbook on
> Zeppelin,
> https://medium.com/analytics-vidhya/learn-flink-sql-the-easy-way-d9d48a95ae57
> The easy way to learn Flink Sql.
>
> Hope it would be helpful for you and welcome to join our community to
> discuss with others. http://zeppelin.apache.org/community.html
>
>
> --
> Best Regards
>
> Jeff Zhang
>
>
>


[jira] [Created] (FLINK-24006) MailboxExecutorImplTest#testIsIdle does not test the correct behaviour

2021-08-26 Thread Fabian Paul (Jira)
Fabian Paul created FLINK-24006:
---

 Summary: MailboxExecutorImplTest#testIsIdle does not test the 
correct behaviour
 Key: FLINK-24006
 URL: https://issues.apache.org/jira/browse/FLINK-24006
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Task, Tests
Affects Versions: 1.13.2, 1.12.5, 1.14.0
Reporter: Fabian Paul


The test was introduced to ensure that mailbox idleness still counts new 
messages although the mailbox loop might have been stopped.

Unfortunately, the test currently does not stop the mailbox processor, which 
means the test even passes without the actual code changes of 
https://issues.apache.org/jira/browse/FLINK-19109



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24007) Support Avro timestamp conversion with precision greater than three

2021-08-26 Thread Xingcan Cui (Jira)
Xingcan Cui created FLINK-24007:
---

 Summary: Support Avro timestamp conversion with precision greater 
than three
 Key: FLINK-24007
 URL: https://issues.apache.org/jira/browse/FLINK-24007
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.13.2
Reporter: Xingcan Cui


{{AvroSchemaConverter.convertToSchema()}} doesn't currently support timestamps 
with precision > 3. This seems to be a bug and should be fixed.
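
A minimal sketch reproducing the limitation (assuming flink-avro on the classpath; the second conversion below is the one expected to fail):
{code:java}
import org.apache.flink.formats.avro.typeutils.AvroSchemaConverter;
import org.apache.flink.table.api.DataTypes;

public class AvroPrecisionRepro {
    public static void main(String[] args) {
        // precision 3 (millis) converts fine ...
        System.out.println(
                AvroSchemaConverter.convertToSchema(
                        DataTypes.TIMESTAMP(3).getLogicalType()));

        // ... but precision 6 (micros) currently throws, although Avro
        // itself has a timestamp-micros logical type.
        System.out.println(
                AvroSchemaConverter.convertToSchema(
                        DataTypes.TIMESTAMP(6).getLogicalType()));
    }
}
{code}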



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24008) Support state cleanup based on unique keys

2021-08-26 Thread Nico Kruber (Jira)
Nico Kruber created FLINK-24008:
---

 Summary: Support state cleanup based on unique keys
 Key: FLINK-24008
 URL: https://issues.apache.org/jira/browse/FLINK-24008
 Project: Flink
  Issue Type: New Feature
  Components: Table SQL / Runtime
Affects Versions: 1.14.0
Reporter: Nico Kruber


In a join of two tables where we join on unique columns, e.g. from primary 
keys, we could clean up join state more proactively, since we know whether 
future joins with a given row are still possible (assuming uniqueness of that 
key). While this may not solve all issues of growing state in non-time-based 
joins, it may still considerably reduce state size, depending on the involved 
columns.

This would add one more way of expiring state that the operator stores; 
currently there are only these:
 * time-based joins, e.g. interval join
 * idle state retention via {{TableConfig#setIdleStateRetention()}} (sketched below)
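
For reference, a small sketch of the second mechanism mentioned above, assuming Flink 1.13+ where the Duration-based setter exists:
{code:java}
import java.time.Duration;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class IdleStateRetentionSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Expire per-key operator state that has not been accessed for
        // 12 hours; this is time-based, not key-uniqueness-based.
        tEnv.getConfig().setIdleStateRetention(Duration.ofHours(12));
    }
}
{code}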



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


How heap state backend serialization happens during snapshotting

2021-08-26 Thread Muhammad Haseeb Asif
Hello,

I have been trying to understand how serialization works for the Flink heap 
keyed state backend when a checkpoint happens. I want to know how Flink knows 
which type of serializer to use to serialize the values for the different states 
during the snapshot.

I have debugged both the RocksDB and the heap keyed state backend. I can see that 
state is stored in the COW (copy-on-write) data structure, which is copied. It 
does create a proxy (serializationProxy) for serialization that is used to 
serialize all the values for the different state types (value, list). I cannot 
understand how the proxy does the serialization for an individual value and for 
the whole list.



Thanks




[jira] [Created] (FLINK-24009) Upgrade to Apache pom parent 24

2021-08-26 Thread Kevin Ratnasekera (Jira)
Kevin Ratnasekera created FLINK-24009:
-

 Summary: Upgrade to Apache pom parent 24
 Key: FLINK-24009
 URL: https://issues.apache.org/jira/browse/FLINK-24009
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.13.2
Reporter: Kevin Ratnasekera






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24010) HybridSourceReader needs to forward checkpoint notifications

2021-08-26 Thread Thomas Weise (Jira)
Thomas Weise created FLINK-24010:


 Summary: HybridSourceReader needs to forward checkpoint 
notifications
 Key: FLINK-24010
 URL: https://issues.apache.org/jira/browse/FLINK-24010
 Project: Flink
  Issue Type: Bug
Reporter: Thomas Weise
Assignee: Thomas Weise
 Fix For: 1.14.0, 1.13.3


Since the reader currently swallows notifyCheckpointComplete, the offset commit 
in the contained Kafka consumer doesn't happen.
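
A minimal sketch of the missing forwarding, assuming the reader holds a reference to the currently active underlying reader (the field name is made up):
{code:java}
// Inside HybridSourceReader; `currentReader` is a hypothetical field
// holding the currently active underlying SourceReader.
@Override
public void notifyCheckpointComplete(long checkpointId) throws Exception {
    if (currentReader != null) {
        // Forward the notification so that e.g. a contained Kafka
        // reader gets the chance to commit its offsets.
        currentReader.notifyCheckpointComplete(checkpointId);
    }
}
{code}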



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24011) SET 'sql-client.verbose' = 'true'; on Flink SQL client does not give full stack trace

2021-08-26 Thread James Kim (Jira)
James Kim created FLINK-24011:
-

 Summary: SET 'sql-client.verbose' = 'true'; on Flink SQL client 
does not give full stack trace
 Key: FLINK-24011
 URL: https://issues.apache.org/jira/browse/FLINK-24011
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Common, Table SQL / Client
Affects Versions: 1.12.1
 Environment: Ubuntu 18.04
Reporter: James Kim


I'm trying to get the Flink SQL client to read data from S3-compatible storage, 
but I've been getting this error:

 

"[ERROR] Could not execute SQL statement. Reason:
org.apache.flink.util.SerializedThrowable: Unsupported or unrecognized SSL 
message"

This did not give me any hints to start debugging, so I tried SET 
'sql-client.verbose' = 'true';, but I still did not get a verbose stack trace of 
the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-24012) PulsarSourceITCase.testTaskManagerFailure fails due to NoResourceAvailableException

2021-08-26 Thread Xintong Song (Jira)
Xintong Song created FLINK-24012:


 Summary: PulsarSourceITCase.testTaskManagerFailure fails due to 
NoResourceAvailableException
 Key: FLINK-24012
 URL: https://issues.apache.org/jira/browse/FLINK-24012
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Pulsar
Affects Versions: 1.14.0
Reporter: Xintong Song
 Fix For: 1.14.0


https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22918&view=logs&j=a5ef94ef-68c2-57fd-3794-dc108ed1c495&t=2c68b137-b01d-55c9-e603-3ff3f320364b&l=24431




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] FLIP-173: Support DAG of algorithms (Flink ML)

2021-08-26 Thread Zhipeng Zhang
Thanks for the post, Dong :)

We welcome everyone to drop us an email on Flink ML. Let's work together to
build machine learning on Flink :)


Dong Lin wrote on Wed, Aug 25, 2021 at 8:58 PM:

> Hi everyone,
>
> Based on the feedback received in the online/offline discussion in the
> past few weeks, we (Zhipeng, Fan, myself and a few other developers at
> Alibaba) have reached agreement on the design to support DAG of algorithms.
> We have merged the ideas from the initial two options into this FLIP-176
> design doc.
>
> If you have comments on the latest design doc, please let us know!
>
> Cheers,
> Dong
>
>
> On Mon, Aug 23, 2021 at 5:07 PM Becket Qin  wrote:
>
>> Thanks for the comments, Fan. Please see the reply inline.
>>
>> On Thu, Aug 19, 2021 at 10:25 PM Fan Hong  wrote:
>>
>> > Hi, Becket,
>> >
>> > Many thanks to your detailed review. I agree that it is easier to
>> involve
>> > more people to discuss if fundamental differences are highlighted.
>> >
>> >
>> > Here are some of my thoughts to help other people to think about these
>> > differences. (correct me if those technique details are not right.)
>> >
>> >
>> > 1. One set of API or not? May be not that important.
>> >
>> >
>> > First of all, AlgoOperators and Pipeline / Transformer / Estimator in
>> > Proposal 2 are absolutely *NOT* independent.
>> >
>> >
>> > One may think they are independent, because they see Pipeline /
>> > Transformer / Estimator are already in Flink ML Lib and AlgoOperators
>> are
>> > lately added in this proposal. But that's not true. If you check
>> Alink[1]
>> > where the idea of Proposal 2 originated, both of them have been
>> presented
>> > long ago, and they collaborate tightly.
>> >
>>
>> > In the aspects of functionalities, they are also not independent. Their
>> > relation is more like a two-level API to specify ML tasks:
>> AlgoOperators is
>> > a general-purpose level to represent any ML algorithms, while Pipeline /
>> > Transformer / Estimator provides a higher-level API which enables
>> wrapping
>> > multiple ML algorithms together in a fit-transform way.
>> >
>>
>> We probably need to first clarify what "independent" means here. Sure,
>> users can always wrap the Transformer into an AlgoOperator, but users can
>> basically wrap any code, any class into an AlgoOperator. And we wouldn't
>> say AlgoOperator is not independent of any class, right? In my opinion,
>> the
>> two APIs are independent because even if we agree that Transformers are
>> doing things that are conceptually a subset of what AlgoOperators do, a
>> Transformer cannot be used as an AlgoOperator out of the box without
>> wrapping. And even worse, a MIMO AlgoOperator cannot be wrapped into a
>> Transformer / Estimator if these two APIs are SISO. So from what I see, in
>> Option 2, these two APIs are independent from API design perspective.
>>
>> One could consider Flink DataStream - Table as an analogy to AlgoOperators
>> > - Pipeline. The two-level APIs provides different functionalities to end
>> > users, and the higher-level API will call lower-level of API in internal
>> > implementation. I'm not saying the two-level API design in Proposal 2 is
>> > good because Flink already did this. I just hope to help community
>> people
>> > to understand the relation between AlgoOperators and Pipeline.
>> >
>>
>> I am not sure if it is accurate to say DataStream is a low-level API of
>> Table. They are simply two different DSL, one for relational / SQL-like
>> analytics paradigm, and the other for those who are more familiar with
>> streaming applications. More importantly, they are designed to
>> support conversion from one to the other out of the box, which is unlike
>> Pipeline and AlgoOperators in proposal 2.
>>
>>
>> > An additional usage and benefit of Pipeline API is that SISO
>> PipelineModel
>> > corresponds to a deployable unit for online serving exactly.
>> >
>> > In online serving, Flink runtime are usually avoided to achieve low
>> > latency. So models have to be wrapped for transmission from Flink
>> ecosystem
>> > to a non-Flink one. Here is the place where the wrapping is really
>> needed
>> > and inevitable, because the serving service providers are usually
>> expected
>> > to be general to one type of models. Pipeline API in Proposal 2 target
>> to
>> > this scene exactly without complicated APIs.
>> >
>> > Yet, for offline or nearline inference, they can be completed in Flink
>> > ecosystem. That's where Flink ML Lib still exists, so a loose wrapping
>> > using AlgoOperators in Proposal 2 still works with not much overhead.
>> >
>>
>> It seems that a MIMO transformer can easily support all SISO use cases,
>> right? And there is zero overhead because users may not have to wrap
>> AlgoOperators, but can just build a Pipeline directly by putting either
>> Transformer or AlgoOperators into it, without worrying about whether they
>> are interoperable.
>>
>>
>> > At the 

[jira] [Created] (FLINK-24013) Add IssueNavigationLink for IDEA

2021-08-26 Thread gaoyajun02 (Jira)
gaoyajun02 created FLINK-24013:
--

 Summary: Add IssueNavigationLink for IDEA
 Key: FLINK-24013
 URL: https://issues.apache.org/jira/browse/FLINK-24013
 Project: Flink
  Issue Type: Improvement
  Components: Build System
Affects Versions: 1.14.1
Reporter: gaoyajun02
 Fix For: 1.14.1


just like SPARK-35223



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: How heap state backend serialization happens during snapshotting

2021-08-26 Thread JING ZHANG
Hello,
Let me share my views on this issue; please correct me if anything is wrong.

To get a state handle, you have to create a StateDescriptor. This holds the
name of the state, the type of the values that the state holds, and other
information. You can pass either the type serializer or the type information
when specifying the type of the values in the state; type information can
generate the serializer later. Please see more information in [1].
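
For example, here is a minimal sketch of the two variants (the state names are arbitrary):
{code:java}
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeutils.base.LongSerializer;
import org.apache.flink.api.java.tuple.Tuple2;

public class DescriptorVariants {
    public static void main(String[] args) {
        // Variant 1: pass TypeInformation; Flink derives the serializer
        // from it lazily (e.g. on first state access).
        ValueStateDescriptor<Tuple2<Long, Long>> withTypeInfo =
                new ValueStateDescriptor<>(
                        "average",
                        TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() {}));

        // Variant 2: pass a TypeSerializer directly.
        ValueStateDescriptor<Long> withSerializer =
                new ValueStateDescriptor<>("count", LongSerializer.INSTANCE);
    }
}
{code}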

A snapshot of all meta information about one state in a state backend is stored
in `StateMetaInfoSnapshot`, which is used by the serialization proxy during
snapshotting. You can find more information in the classes
`OperatorBackendSerializationProxy` and `KeyedBackendSerializationProxy`.

Welcome to further discussion.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/state/

Best regards,
JING ZHANG

Muhammad Haseeb Asif wrote on Fri, Aug 27, 2021 at 2:41 AM:

> Hello,
>
> I have been trying to understand how serialization works for the Flink
> Heap Keyed state backend when a checkpoint happens. I want to know how
> Flink knows which type of serializer to use to serialize the values for
> different states during the snapshot.
>
> I have debugged both RocksDB and heap keyed state backend. I can see that
> state is stored in the COW(Copy on Write) data structure which is copied.
> It does create a proxy (serializationProxy) for serialization that is used
> to serialized all the values for different state types (value, list). I
> cannot understand how the proxy is doing serialization for an individual
> value and the whole list.
>
>
>
> Thanks
>
>
>