[ https://issues.apache.org/jira/browse/FLINK-8910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411473#comment-16411473 ]
ASF GitHub Bot commented on FLINK-8910:
---------------------------------------

Github user kl0u commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5676#discussion_r176753796

    --- Diff: flink-end-to-end-tests/test-scripts/test_local_recovery_and_scheduling.sh ---
    @@ -0,0 +1,111 @@
    +#!/usr/bin/env bash
    +
    +################################################################################
    +# Licensed to the Apache Software Foundation (ASF) under one
    +# or more contributor license agreements. See the NOTICE file
    +# distributed with this work for additional information
    +# regarding copyright ownership. The ASF licenses this file
    +# to you under the Apache License, Version 2.0 (the
    +# "License"); you may not use this file except in compliance
    +# with the License. You may obtain a copy of the License at
    +#
    +# http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +################################################################################
    +
    +source "$(dirname "$0")"/common.sh
    +
    +function checkLogs {
    +  parallelism=$1
    +  attempts=$2
    +  (( expectedCount=parallelism * (attempts + 1) ))
    +
    +  # Search for the log message that indicates restore problem from existing local state for the keyed backend.
    +  failedLocalRecovery=$(grep '^.*Creating keyed state backend.* from alternative (2/2)\.$' $FLINK_DIR/log/* | wc -l | tr -d ' ')
    +
    +  # Search for attempts to recover locally.
    +  attemptLocalRecovery=$(grep '^.*Creating keyed state backend.* from alternative (1/2)\.$' $FLINK_DIR/log/* | wc -l | tr -d ' ')
    +
    +  if [ ${failedLocalRecovery} -ne 0 ]
    +  then
    +    PASS=""
    +    echo "FAILURE: Found ${failedLocalRecovery} failed attempt(s) for local recovery of correctly scheduled task(s)."
    +  fi
    +
    +  if [ ${attemptLocalRecovery} -eq 0 ]
    +  then
    +    PASS=""
    +    echo "FAILURE: Found no attempt for local recovery. Configuration problem?"
    +  fi
    +}
    +
    +function cleanupAfterTest {
    +  # Reset the configurations
    +  sed -i -e 's/state.backend.local-recovery: .*//' "$FLINK_DIR/conf/flink-conf.yaml"
    +  sed -i -e 's/log4j.rootLogger=.*/log4j.rootLogger=INFO, file/' "$FLINK_DIR/conf/log4j.properties"
    +  #
    +  kill ${watchdogPid} 2> /dev/null
    +  wait ${watchdogPid} 2> /dev/null
    +  #
    --- End diff --

    The value `watchdogPid ` is not initialized here.

> Introduce automated end-to-end test for local recovery (including sticky scheduling)
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-8910
>                 URL: https://issues.apache.org/jira/browse/FLINK-8910
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.5.0
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>            Priority: Major
>             Fix For: 1.5.0
>
>
> We should have an automated end-to-end test that can run nightly to check
> that sticky allocation and local recovery work as expected.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
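
For reference on the `watchdogPid` remark above, here is a minimal sketch of one way a cleanup function could guard against the variable being unset. It is not the change made in the pull request. It assumes the watchdog is started as a background job at some later point in the script; the `sleep 60` job is a hypothetical stand-in for that watchdog, and declaring `watchdogPid=""` up front is likewise an illustrative choice.

```shell
#!/usr/bin/env bash

# Assumption: declare the variable up front so that cleanup can tell
# "watchdog never started" apart from "watchdog running with this PID".
watchdogPid=""

function cleanupAfterTest {
  # Only signal and reap the watchdog if a PID was actually recorded.
  if [ -n "${watchdogPid}" ]; then
    kill "${watchdogPid}" 2> /dev/null || true
    wait "${watchdogPid}" 2> /dev/null || true
  fi
  watchdogPid=""
}

# Safe to call before the watchdog exists: nothing is killed.
cleanupAfterTest

# Hypothetical stand-in for the real watchdog loop.
sleep 60 &
watchdogPid=$!

# Now the recorded background job is terminated and reaped.
cleanupAfterTest
```

With an empty or unset variable, an unquoted `kill ${watchdogPid}` expands to a bare `kill`, which only prints a usage error that the redirect hides, and `wait` with no arguments blocks until every background job has finished. The explicit `-n` check avoids both.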