[ https://issues.apache.org/jira/browse/SLING-12690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17936389#comment-17936389 ]
Timothee Maret commented on SLING-12690:
----------------------------------------

Potentially by sampling stack traces and deciding that the import is stuck when the commit stack trace has not changed for a configured number of samples and the thread is runnable. That would be complex though. So I think that a 3h timeout, without sampling the stacks, is probably a better first step.

> Skip package if import is stuck for too long
> --------------------------------------------
>
>                 Key: SLING-12690
>                 URL: https://issues.apache.org/jira/browse/SLING-12690
>             Project: Sling
>          Issue Type: Improvement
>          Components: Content Distribution
>            Reporter: Christian Schneider
>            Assignee: Christian Schneider
>            Priority: Major
>             Fix For: Content Distribution Journal Core 0.5.2
>
>
> When importing a content package we call filevault to import the package into
> oak.
> This is a synchronous call that blocks until the import is finished.
> We have cases where this import takes much longer than expected and causes
> unavailability of replication for other authors.
> We should introduce a maximum time after which we consider the import to be
> failed and mark the package as skipped.
> So if an import takes longer than this defined time we must:
> * Send out a status message to skip the package. So other pods also skip the
> package
> * Mark the offset of the package as processed

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
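
The maximum import time proposed in the issue could be sketched in Java roughly as follows. This is an illustration under assumptions, not the actual fix: `ImportTimeoutSketch`, `Outcome` and `importWithTimeout` are hypothetical names that do not exist in Sling or FileVault, and the `Runnable` stands in for the blocking import call. Sending the skip status message and marking the offset as processed are left to the caller.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ImportTimeoutSketch {

    /** Outcome of one import attempt (hypothetical type, not part of Sling). */
    public enum Outcome { IMPORTED, FAILED, SKIPPED_TIMEOUT }

    /**
     * Runs the blocking import on a worker thread and waits at most the given time.
     * The Runnable is a placeholder for the synchronous package import.
     */
    public static Outcome importWithTimeout(Runnable blockingImport, long timeout, TimeUnit unit) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "package-import");
            t.setDaemon(true); // a stuck import should not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(blockingImport);
        try {
            future.get(timeout, unit);
            return Outcome.IMPORTED;
        } catch (TimeoutException e) {
            // Best effort only: an interrupt does not stop code that ignores it.
            future.cancel(true);
            // Assumed follow-up by the caller: send the skip status message
            // and mark the package offset as processed.
            return Outcome.SKIPPED_TIMEOUT;
        } catch (ExecutionException e) {
            // The import itself threw an exception.
            return Outcome.FAILED;
        } catch (InterruptedException e) {
            // The waiting thread was interrupted; restore the flag and give up.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Outcome.FAILED;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Outcome fast = importWithTimeout(() -> { }, 1, TimeUnit.SECONDS);
        Outcome stuck = importWithTimeout(() -> {
            try {
                Thread.sleep(5_000); // simulates an import that hangs
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 100, TimeUnit.MILLISECONDS);
        System.out.println(fast + " " + stuck);
    }
}
```

With the 3h limit suggested in the comment, the call would be `importWithTimeout(importer, 3, TimeUnit.HOURS)`. Note that a thread that is truly stuck and ignores interruption keeps running after the timeout; the sketch only lets the caller stop waiting and move on to the next package.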