I'm forwarding my comment from the Jira Issue [1]:
In
https://github.com/appleyuchi/wrongcheckpoint/blob/master/src/main/scala/wordcount_increstate.scala
you set the RocksDBStateBackend, in
https://github.com/appleyuchi/wrongcheckpoint/blob/master/src/main/scala/StateWordCount.scala
you set the FsStateBackend. This will not work because the RocksDB
savepoint is not compatible.
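As a rough sketch only (I have not run this against your cluster): resuming with the same main class that wrote the checkpoint keeps the state backend identical on both sides. The helper name resume_cmd is hypothetical and only prints the command instead of running it. The checkpoint path, the class name wordcount_increstate, and the jar name are copied from the steps quoted further down in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Flink): prints the resume command
# so it can be inspected before running it by hand.
resume_cmd() {
  ckpt_path="$1"    # retained checkpoint directory, e.g. .../chk-22
  main_class="$2"   # must be the job class that WROTE the checkpoint
  jar_file="$3"
  printf 'flink run -s %s -c %s %s\n' "$ckpt_path" "$main_class" "$jar_file"
}

# The checkpoint was written by wordcount_increstate (RocksDBStateBackend),
# so the sketch resumes with that class rather than StateWordCount.
resume_cmd \
  "hdfs://Desktop:9000/tmp/flinkck/df6d62a43620f258155b8538ef0ddf1b/chk-22" \
  wordcount_increstate \
  datastream_api-1.0-SNAPSHOT.jar
```

The alternative would be to change StateWordCount.scala so that it also sets the RocksDBStateBackend, and then keep using -c StateWordCount.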
Best,
Aljoscha
[1] https://issues.apache.org/jira/browse/FLINK-19486
On 06.10.20 11:02, the original poster wrote:
I don't know where I changed the state backends.
There are two meanings of "restarting":
- Restarting automatically (succeeded in my experiment)
- Restarting manually (failed in my experiment)
The whole experiment (just a word count case) and its steps are listed on my GitHub:
https://github.com/appleyuchi/wrongcheckpoint
Could you spare some time for me to check it?
Thanks for your help~!
------------------ Original Message ------------------
From: "David Anderson" <dander...@apache.org>
Sent: 2020-10-06 4:32
To: <appleyu...@foxmail.com>
Cc: "user" <user@flink.apache.org>
Subject: Re: need help about "incremental checkpoint",Thanks
This error comes because you changed state backends. The checkpoint was written
by a different state backend. This is not supported.
To use incremental checkpoints, you must use the RocksDBStateBackend both times:
first, when running the job and writing the checkpoint, and again later when
restarting.
Best,
David
On Mon, Oct 5, 2020 at 2:38 PM <appleyu...@foxmail.com> wrote:
Could you give more details?
Thanks
------------------ Original Message ------------------
From: <appleyu...@foxmail.com>
Sent: 2020-10-03 9:30
To: "David Anderson" <dander...@apache.org>
Cc: "user" <user@flink.apache.org>
Subject: Re: need help about "incremental checkpoint",Thanks
Where is the actual path?
I can only get one path from the web UI.
Is it possible that the error in step 5 is caused by a fault in my code?
------------------ Original Message ------------------
From: <753743...@qq.com>
Sent: 2020-10-03 9:13
To: "David Anderson" <dander...@apache.org>
Cc: "user" <user@flink.apache.org>
Subject: Re: need help about "incremental checkpoint",Thanks
Thanks~!!
I have compared your command with mine in step 5.
Mine is:
"flink run -s hdfs://Desktop:9000/tmp/flinkck/df6d62a43620f258155b8538ef0ddf1b/chk-22 -c StateWordCount datastream_api-1.0-SNAPSHOT.jar"
Yours is:
$ bin/flink run -s hdfs://Desktop:9000/tmp/flinkck/1de98c1611c134d915d19ded33aeab54/chk-3 <jar file> [args]
They are the same.
Could you tell me where I went wrong?
------------------------------------------------------------------------------------------------------------------------
Maybe the error is not caused by this command?
"Unexpected state handle type, expected:
class org.apache.flink.runtime.state.KeyGroupsStateHandle,
but found:
class org.apache.flink.runtime.state.IncrementalRemoteKeyedStateHandle"
----------------------------------------------------------------------------------------------------------------------------------
Thanks~
------------------ Original Message ------------------
From: "David Anderson" <dander...@apache.org>
Sent: 2020-10-03 0:05
To: <753743...@qq.com>
Cc: "user" <user@flink.apache.org>
Subject: Re: need help about "incremental checkpoint",Thanks
If hdfs://Desktop:9000/tmp/flinkck/1de98c1611c134d915d19ded33aeab54/chk-3 was
written by the RocksDBStateBackend, then you can use it to recover if the new
job is also using the RocksDBStateBackend. The command would be
$ bin/flink run -s hdfs://Desktop:9000/tmp/flinkck/1de98c1611c134d915d19ded33aeab54/chk-3 <jar file> [args]
The ":" character is meant to indicate that you should not use the literal string
"checkpointMetaDataPath", but rather replace that with the actual path. Do not include
the : character.
David
On Fri, Oct 2, 2020 at 5:58 PM <753743...@qq.com> wrote:
>
> I have read the official documentation:
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/state/checkpoints.html#directory-structure
>
> At the end of that page, it says:
>
> $ bin/flink run -s :checkpointMetaDataPath [:runArgs]
>
> I tried this command in my previous experiment, but still had no luck.
> Why does the official command have ":" after "run -s"?
> I guess the ":" is not necessary.
>
> Could you tell me the right command to recover (resume) from an
> incremental checkpoint (RocksDBStateBackend)?
>
> Many thanks!
>
>
> ------------------ Original Message ------------------
> From: <appleyu...@foxmail.com>
> Sent: 2020-10-02 11:41
> To: "David Anderson" <dander...@apache.org>
> Cc: "user" <user@flink.apache.org>
> Subject: Re: need help about "incremental checkpoint",Thanks
>
> Thanks for your replies~!
>
> Could you tell me the right command to recover from a checkpoint
> manually using the RocksDB files?
>
> I understand that checkpoints are meant for automatic recovery,
> but in this experiment I stopped the job by force (by entering "error" four times in nc -lk 9999).
> Is there a way to recover manually from an incremental checkpoint (with
> the RocksDBStateBackend)?
>
> I can only find
> hdfs://Desktop:9000/tmp/flinkck/1de98c1611c134d915d19ded33aeab54/chk-3 in my
> web UI (I guess this is only used for the FsStateBackend).
>
> Thanks for your help~!
>
> ------------------ Original Message ------------------
> From: "David Anderson" <dander...@apache.org>
> Sent: 2020-10-02 11:24
> To: <appleyu...@foxmail.com>
> Cc: "user" <user@flink.apache.org>
> Subject: Re: need help about "incremental checkpoint",Thanks
>
>> Write in RocksDbStateBackend.
>> Read in FsStateBackend.
>> It's NOT a match.
>
>
> Yes, that is right. Also, this does not work:
>
> Write in FsStateBackend
> Read in RocksDbStateBackend
>
> For questions and support in Chinese, you can use the
> user...@flink.apache.org mailing list. See the instructions at
> https://flink.apache.org/zh/community.html for how to join the list.
>
> Best,
> David
>
> On Fri, Oct 2, 2020 at 4:45 PM <appleyu...@foxmail.com> wrote:
>>
>> Thanks for your replies~!
>>
>> My English is poor. Here is my understanding of your replies:
>>
>> Write in RocksDbStateBackend.
>> Read in FsStateBackend.
>> It's NOT a match.
>> So I'm wrong in step 5?
>> Is my above understanding right?
>>
>> Thanks for your help.
>>
>> ------------------ Original Message ------------------
>> From: "David Anderson" <dander...@apache.org>
>> Sent: 2020-10-02 10:35
>> To: <appleyu...@foxmail.com>
>> Cc: "user" <user@flink.apache.org>
>> Subject: Re: need help about "incremental checkpoint",Thanks
>>
>> It looks like you were trying to resume from a checkpoint taken with
>> the FsStateBackend into a revised version of the job that uses the
>> RocksDBStateBackend. Switching state backends in this way is not
>> supported: checkpoints and savepoints are written in a
>> state-backend-specific format, and can only be read by the same backend
>> that wrote them.
>>
>> It is possible, however, to migrate between state backends using the
>> State Processor API [1].
>>
>> [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/libs/state_processor_api.html
>>
>> Best,
>> David
>>
>> On Fri, Oct 2, 2020 at 4:07 PM <appleyu...@foxmail.com> wrote:
>>>
>>> I want to run an experiment with "incremental checkpoint".
>>>
>>> my code is:
>>>
>>> https://paste.ubuntu.com/p/DpTyQKq6Vk/
>>>
>>>
>>>
>>> pom.xml is:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <project xmlns="http://maven.apache.org/POM/4.0.0"
>>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
>>> <modelVersion>4.0.0</modelVersion>
>>>
>>> <groupId>example</groupId>
>>> <artifactId>datastream_api</artifactId>
>>> <version>1.0-SNAPSHOT</version>
>>> <build>
>>> <plugins>
>>> <plugin>
>>> <groupId>org.apache.maven.plugins</groupId>
>>> <artifactId>maven-compiler-plugin</artifactId>
>>> <version>3.1</version>
>>> <configuration>
>>> <source>1.8</source>
>>> <target>1.8</target>
>>> </configuration>
>>> </plugin>
>>>
>>> <plugin>
>>> <groupId>org.scala-tools</groupId>
>>> <artifactId>maven-scala-plugin</artifactId>
>>> <version>2.15.2</version>
>>> <executions>
>>> <execution>
>>> <goals>
>>> <goal>compile</goal>
>>> <goal>testCompile</goal>
>>> </goals>
>>> </execution>
>>> </executions>
>>> </plugin>
>>>
>>>
>>>
>>> </plugins>
>>> </build>
>>>
>>> <dependencies>
>>>
>>> <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-scala -->
>>> <dependency>
>>> <groupId>org.apache.flink</groupId>
>>> <artifactId>flink-streaming-scala_2.11</artifactId>
>>> <version>1.11.1</version>
>>> <!--<scope>provided</scope>-->
>>> </dependency>
>>>
>>> <!--<dependency>-->
>>> <!--<groupId>org.apache.flink</groupId>-->
>>> <!--<artifactId>flink-streaming-java_2.12</artifactId>-->
>>> <!--<version>1.11.1</version>-->
>>> <!--<scope>compile</scope>-->
>>> <!--</dependency>-->
>>>
>>> <dependency>
>>> <groupId>org.apache.flink</groupId>
>>> <artifactId>flink-clients_2.11</artifactId>
>>> <version>1.11.1</version>
>>> </dependency>
>>>
>>>
>>>
>>> <dependency>
>>> <groupId>org.apache.flink</groupId>
>>> <artifactId>flink-statebackend-rocksdb_2.11</artifactId>
>>> <version>1.11.2</version>
>>> <!--<scope>test</scope>-->
>>> </dependency>
>>>
>>> <dependency>
>>> <groupId>org.apache.hadoop</groupId>
>>> <artifactId>hadoop-client</artifactId>
>>> <version>3.3.0</version>
>>> </dependency>
>>>
>>> <dependency>
>>> <groupId>org.apache.flink</groupId>
>>> <artifactId>flink-core</artifactId>
>>> <version>1.11.1</version>
>>> </dependency>
>>>
>>> <!--<dependency>-->
>>> <!--<groupId>org.slf4j</groupId>-->
>>> <!--<artifactId>slf4j-simple</artifactId>-->
>>> <!--<version>1.7.25</version>-->
>>> <!--<scope>compile</scope>-->
>>> <!--</dependency>-->
>>>
>>> <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-cep -->
>>> <dependency>
>>> <groupId>org.apache.flink</groupId>
>>> <artifactId>flink-cep_2.11</artifactId>
>>> <version>1.11.1</version>
>>> </dependency>
>>>
>>> <dependency>
>>> <groupId>org.apache.flink</groupId>
>>> <artifactId>flink-cep-scala_2.11</artifactId>
>>> <version>1.11.1</version>
>>> </dependency>
>>>
>>> <dependency>
>>> <groupId>org.apache.flink</groupId>
>>> <artifactId>flink-scala_2.11</artifactId>
>>> <version>1.11.1</version>
>>> </dependency>
>>>
>>>
>>>
>>> <dependency>
>>> <groupId>org.projectlombok</groupId>
>>> <artifactId>lombok</artifactId>
>>> <version>1.18.4</version>
>>> <!--<scope>provided</scope>-->
>>> </dependency>
>>>
>>> </dependencies>
>>> </project>
>>>
>>>
>>>
>>> the error I got is:
>>>
>>> https://paste.ubuntu.com/p/49HRYXFzR2/
>>>
>>>
>>>
>>> Part of the above error is:
>>>
>>> Caused by: java.lang.IllegalStateException: Unexpected state handle type, expected: class org.apache.flink.runtime.state.KeyGroupsStateHandle, but found: class org.apache.flink.runtime.state.IncrementalRemoteKeyedStateHandle
>>>
>>>
>>>
>>>
>>>
>>> The steps are:
>>>
>>> 1. mvn clean scala:compile compile package
>>>
>>> 2. nc -lk 9999
>>>
>>> 3. flink run -c wordcount_increstate datastream_api-1.0-SNAPSHOT.jar
>>> Job has been submitted with JobID df6d62a43620f258155b8538ef0ddf1b
>>>
>>> 4. Input the following contents in nc -lk 9999:
>>>
>>> before
>>> error
>>> error
>>> error
>>> error
>>>
>>> 5. flink run -s hdfs://Desktop:9000/tmp/flinkck/df6d62a43620f258155b8538ef0ddf1b/chk-22 -c StateWordCount datastream_api-1.0-SNAPSHOT.jar
>>>
>>> Then the above error happens.
>>>
>>>
>>>
>>> Please help,Thanks~!
>>>
>>>
>>> I have tried to subscribe to user@flink.apache.org,
>>>
>>> but got no reply. If possible, please send your replies to
>>> appleyu...@foxmail.com. Thanks.
>>>
>>>