I'm trying to set up my Flink cluster to store checkpoints on GCS, using just a
basic Docker setup.
Per the instructions, I added the GCS jar file and exported my credentials file
path as an environment variable:
*GOOGLE_APPLICATION_CREDENTIALS=/opt/flink/my-keys-sa.json*
I also specified the checkpoints directory in *flink-conf.yaml*.
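
The credentials part of this setup can be sanity-checked from inside the container before digging into the logs. The following is only a minimal sketch, not something taken from the Flink or GCS connector docs: the function name `check_gcs_credentials` is hypothetical, and the field names are the ones found in a standard Google service-account JSON key. It verifies that the variable is set and that the file exists, is readable, and looks like a service-account key. It does not verify that the account actually has access to the bucket.

```python
import json
import os

# Fields present in a standard service-account key file (subset).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_gcs_credentials(env=None):
    """Return a list of problems found with the key file (empty if none).

    `env` defaults to the process environment; passing a dict is only
    a convenience for trying the check outside the container.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return ["GOOGLE_APPLICATION_CREDENTIALS is not set"]
    if not os.path.isfile(path):
        return [f"{path} does not exist or is not a regular file"]
    if not os.access(path, os.R_OK):
        return [f"{path} is not readable by the current user"]
    try:
        with open(path, encoding="utf-8") as handle:
            key = json.load(handle)
    except (OSError, ValueError) as exc:
        return [f"{path} could not be parsed as JSON: {exc}"]
    if not isinstance(key, dict):
        return [f"{path} does not contain a JSON object"]

    problems = [
        f"missing field '{name}'" for name in REQUIRED_FIELDS if not key.get(name)
    ]
    kind = key.get("type")
    if kind and kind != "service_account":
        problems.append(f"unexpected key type '{kind}'")
    return problems
```

Calling `print(check_gcs_credentials())` inside the jobmanager container prints an empty list when the file is structurally fine. In that case the next things to check would be the service account's permissions on the bucket and whether the same variable and file are also present in the taskmanager containers.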

When I start the job, though, I get the following exception:
jobmanager                 | 2023-02-06 13:04:26,773 INFO
 org.apache.flink.fs.gs.GSFileSystemFactory                   [] - Creating
GSFileSystem for uri gs://flink-test-bucket/checkpoints with options
GSFileSystemOptions{writerTemporaryBucketName=Optional.empty,
writerChunkSize=Optional.empty}
jobmanager                 | 2023-02-06 13:04:27,818 INFO
 org.apache.flink.runtime.dispatcher.StandaloneDispatcher     [] - Job
81f559b146eb668c9a2c55be5d55d4a4 reached terminal state FAILED.
jobmanager                 |
org.apache.flink.runtime.client.JobInitializationException: Could not start
the JobMaster.
jobmanager                 |  at
org.apache.flink.runtime.jobmaster.DefaultJobMasterServiceProcess.lambda$new$0(DefaultJobMasterServiceProcess.java:97)
jobmanager                 |  at
java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(Unknown
Source)
jobmanager                 |  at
java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(Unknown
Source)
jobmanager                 | Caused by: org.apache.flink.util.FlinkRuntimeException: Failed to
create checkpoint storage at checkpoint coordinator side.
jobmanager                 |  at
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:337)
jobmanager                 |  at
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:245)
jobmanager                 |  at
org.apache.flink.runtime.executiongraph.DefaultExecutionGraph.enableCheckpointing(DefaultExecutionGraph.java:511)
jobmanager                 |  at
org.apache.flink.runtime.executiongraph.DefaultExecutionGraphBuilder.buildGraph(DefaultExecutionGraphBuilder.java:317)
jobmanager                 |  at
org.apache.flink.runtime.scheduler.DefaultExecutionGraphFactory.createAndRestoreExecutionGraph(DefaultExecutionGraphFactory.java:156)
jobmanager                 |  at
org.apache.flink.runtime.scheduler.SchedulerBase.createAndRestoreExecutionGraph(SchedulerBase.java:361)
jobmanager                 |  at
org.apache.flink.runtime.scheduler.SchedulerBase.<init>(SchedulerBase.java:206)
jobmanager                 |  at
org.apache.flink.runtime.scheduler.DefaultScheduler.<init>(DefaultScheduler.java:134)
jobmanager                 |  at
org.apache.flink.runtime.scheduler.DefaultSchedulerFactory.createInstance(DefaultSchedulerFactory.java:152)
jobmanager                 |  at
org.apache.flink.runtime.jobmaster.DefaultSlotPoolServiceSchedulerFactory.createScheduler(DefaultSlotPoolServiceSchedulerFactory.java:119)
jobmanager                 |  at
org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:369)
jobmanager                 |  at
org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:346)
jobmanager                 |  at
org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.internalCreateJobMasterService(DefaultJobMasterServiceFactory.java:123)
jobmanager                 |  at
org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.lambda$createJobMasterService$0(DefaultJobMasterServiceFactory.java:95)
jobmanager                 |  at
org.apache.flink.util.function.FunctionUtils.lambda$uncheckedSupplier$4(FunctionUtils.java:112)
jobmanager                 |  ... 4 more
jobmanager                 | Caused by: java.io.IOException: Multiple
IOExceptions.
jobmanager                 |  at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageExceptions.createCompositeException(GoogleCloudStorageExceptions.java:78)
jobmanager                 |  at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfos(GoogleCloudStorageImpl.java:1883)
jobmanager                 |  at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.checkNoFilesConflictingWithDirs(GoogleCloudStorageFileSystem.java:1224)
jobmanager                 |  at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.mkdirsInternal(GoogleCloudStorageFileSystem.java:498)
jobmanager                 |  at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystem.mkdirs(GoogleCloudStorageFileSystem.java:472)
jobmanager                 |  at
com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.mkdirs(GoogleHadoopFileSystemBase.java:921)
jobmanager                 |  at
org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2388)
jobmanager                 |  at
org.apache.flink.fs.gs.org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.mkdirs(HadoopFileSystem.java:183)
jobmanager                 |  at
org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.mkdirs(PluginFileSystemFactory.java:162)
jobmanager                 |  at
org.apache.flink.runtime.state.filesystem.FsCheckpointStorageAccess.initializeBaseLocationsForCheckpoint(FsCheckpointStorageAccess.java:116)
jobmanager                 |  at
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.<init>(CheckpointCoordinator.java:333)
jobmanager                 |  ... 18 more
jobmanager                 |  Suppressed: java.io.IOException: Error
getting
'gs://flink-test-bucket/checkpoints/81f559b146eb668c9a2c55be5d55d4a4/shared'
object
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$9.onFailure(GoogleCloudStorageImpl.java:1873)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.BatchHelper.execute(BatchHelper.java:184)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.BatchHelper.lambda$queue$0(BatchHelper.java:164)
jobmanager                 |     at
java.base/java.util.concurrent.FutureTask.run(Unknown Source)
jobmanager                 |     ... 3 more
jobmanager                 |  Caused by:
com.google.api.client.googleapis.json.GoogleJsonResponseException
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageExceptions.createJsonResponseException(GoogleCloudStorageExceptions.java:89)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$9.onFailure(GoogleCloudStorageImpl.java:1865)
jobmanager                 |     ... 6 more
jobmanager                 |  Suppressed: java.io.IOException: Error
getting
'gs://flink-test-bucket/checkpoints/81f559b146eb668c9a2c55be5d55d4a4' object
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$9.onFailure(GoogleCloudStorageImpl.java:1873)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.BatchHelper.execute(BatchHelper.java:184)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.BatchHelper.lambda$queue$0(BatchHelper.java:164)
jobmanager                 |     at
java.base/java.util.concurrent.FutureTask.run(Unknown Source)
jobmanager                 |     ... 3 more
jobmanager                 |  Caused by:
com.google.api.client.googleapis.json.GoogleJsonResponseException
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageExceptions.createJsonResponseException(GoogleCloudStorageExceptions.java:89)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$9.onFailure(GoogleCloudStorageImpl.java:1865)
jobmanager                 |     ... 6 more
jobmanager                 |  Suppressed: java.io.IOException: Error
getting 'gs://flink-test-bucket/checkpoints' object
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$9.onFailure(GoogleCloudStorageImpl.java:1873)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.BatchHelper.execute(BatchHelper.java:184)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.BatchHelper.lambda$queue$0(BatchHelper.java:164)
jobmanager                 |     at
java.base/java.util.concurrent.FutureTask.run(Unknown Source)
jobmanager                 |     ... 3 more
jobmanager                 |  Caused by:
com.google.api.client.googleapis.json.GoogleJsonResponseException
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageExceptions.createJsonResponseException(GoogleCloudStorageExceptions.java:89)
jobmanager                 |     at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl$9.onFailure(GoogleCloudStorageImpl.java:1865)
jobmanager                 |     ... 6 more
jobmanager                 | 2023-02-06 13:04:27,830 INFO
 org.apache.flink.runtime.dispatcher.StandaloneDispatcher     [] - Job
81f559b146eb668c9a2c55be5d55d4a4 has been registered for cleanup in the
JobResultStore after reaching a terminal state.



*The bucket is empty in any case.*
Any pointers that could help me troubleshoot this further?

Thanks
