[ 
https://issues.apache.org/jira/browse/NIFI-5331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638731#comment-16638731
 ] 

ASF subversion and git services commented on NIFI-5331:
-------------------------------------------------------

Commit 5872eb3c4a060684a88555f1c697f07bec4c26dd in nifi's branch 
refs/heads/master from [~markap14]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=5872eb3 ]

NIFI-5331: When checkpointing SequentialAccessWriteAheadLog, if the journal is 
not healthy, ensure that we roll it over and ensure that if an Exception is 
thrown when attempting to fsync() or close() the journal, we continue creating 
a new one.
This closes #2952.
Signed-off-by: Brandon Devries <[email protected]>
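
The behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: SketchJournal and SketchWriteAheadLog are hypothetical stand-ins for LengthDelimitedJournal and SequentialAccessWriteAheadLog, and the assumption that close() fails on an unhealthy journal is made only so the example can exercise that path.

```java
import java.io.IOException;

/** Hypothetical stand-in for a journal; not NiFi's LengthDelimitedJournal. */
class SketchJournal {
    private final int id;
    private boolean poisoned = false;

    SketchJournal(int id) { this.id = id; }

    int id() { return id; }
    boolean isHealthy() { return !poisoned; }

    /** Simulates a failed write having marked the journal as unusable. */
    void poison() { poisoned = true; }

    void update(String record) throws IOException {
        checkState();
        // A real journal would serialize and write the record here.
    }

    void fsync() throws IOException { checkState(); }

    // Assumption for illustration: closing an unhealthy journal also fails.
    void close() throws IOException { checkState(); }

    private void checkState() throws IOException {
        if (poisoned) {
            throw new IOException("Cannot update journal " + id + ": a previous write failed");
        }
    }
}

/** Hypothetical stand-in for the write-ahead log that owns the journal. */
class SketchWriteAheadLog {
    private SketchJournal journal = new SketchJournal(0);
    private int nextId = 1;

    SketchJournal journal() { return journal; }

    /** Rolls over to a new journal even if the old one cannot be synced or closed. */
    void checkpoint() {
        try {
            journal.fsync();
        } catch (IOException e) {
            // The old journal is unusable; note the failure and keep going.
            System.err.println("fsync failed, rolling over anyway: " + e.getMessage());
        }
        try {
            journal.close();
        } catch (IOException e) {
            System.err.println("close failed, rolling over anyway: " + e.getMessage());
        }
        // Always reached, so a poisoned journal is replaced by a healthy one.
        journal = new SketchJournal(nextId++);
    }
}
```

In this sketch the roll-over is unconditional, so a checkpoint after a failed write leaves the log with a healthy journal instead of propagating the IOException.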


> SequentialAccessWriteAheadLog: poisoned journal requires restart
> ----------------------------------------------------------------
>
>                 Key: NIFI-5331
>                 URL: https://issues.apache.org/jira/browse/NIFI-5331
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.6.0
>            Reporter: Brandon DeVries
>            Assignee: Mark Payne
>            Priority: Major
>             Fix For: 1.8.0
>
>
> If using the SequentialAccessWriteAheadLog, once a journal becomes poisoned, 
> NiFi can't recover without a restart.  
> SequentialAccessWriteAheadLog uses a LengthDelimitedJournal, which has a 
> "poisoned" flag[1].  This is initially set to "false", but is set to "true" if 
> an Exception or Throwable is encountered on a write operation[2].  Once 
> poisoned, calls to update()[3] will result in a call to checkState()[4], which 
> then throws an IOException stating, "Cannot update journal file... If the 
> repository is able to checkpoint, then this problem will resolve itself..."  
> SequentialAccessWriteAheadLog.checkpoint()[5] creates a new 
> LengthDelimitedJournal, which would hypothetically have a cleared "poisoned" 
> flag.  However, before creating that new Journal, it calls 
> journal.fsync()[6], which calls checkState(), which throws the above 
> IOException if poisoned == true.  So, the FlowFileRepository enters a state 
> where it cannot be written to, and cannot recover, until the instance is 
> restarted.
>  
> [1] 
> [https://github.com/apache/nifi/blob/rel/nifi-1.6.0/nifi-commons/nifi-write-ahead-log/src/main/java/org/apache/nifi/wali/LengthDelimitedJournal.java#L70]
>  
> [2] 
> [https://github.com/apache/nifi/blob/rel/nifi-1.6.0/nifi-commons/nifi-write-ahead-log/src/main/java/org/apache/nifi/wali/LengthDelimitedJournal.java#L208]
>  
> [3] 
> [https://github.com/apache/nifi/blob/rel/nifi-1.6.0/nifi-commons/nifi-write-ahead-log/src/main/java/org/apache/nifi/wali/LengthDelimitedJournal.java#L178]
>  
> [4] 
> [https://github.com/apache/nifi/blob/rel/nifi-1.6.0/nifi-commons/nifi-write-ahead-log/src/main/java/org/apache/nifi/wali/LengthDelimitedJournal.java#L217]
>  
> [5] 
> [https://github.com/apache/nifi/blob/rel/nifi-1.6.0/nifi-commons/nifi-write-ahead-log/src/main/java/org/apache/nifi/wali/SequentialAccessWriteAheadLog.java#L279]
>  
> [6] 
> [https://github.com/apache/nifi/blob/rel/nifi-1.6.0/nifi-commons/nifi-write-ahead-log/src/main/java/org/apache/nifi/wali/SequentialAccessWriteAheadLog.java#L259]
>  
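
The failure mode reported above can be reproduced in miniature. The sketch below is a simplified model written for illustration, not NiFi source: PoisonableJournal and PreFixWriteAheadLog are hypothetical names, and the simulateFailure parameter exists only to trigger a write error on demand.

```java
import java.io.IOException;

/** Simplified model of a journal with a "poisoned" flag; not NiFi source. */
class PoisonableJournal {
    private boolean poisoned = false;

    boolean isPoisoned() { return poisoned; }

    /** Writes a record; any failure during the write poisons the journal. */
    void update(byte[] record, boolean simulateFailure) throws IOException {
        checkState();
        try {
            if (simulateFailure) {
                throw new IOException("simulated disk error");
            }
            // A real journal would write the record here.
        } catch (Throwable t) {
            poisoned = true;
            throw t; // precise rethrow: only IOException or unchecked types
        }
    }

    void fsync() throws IOException { checkState(); }

    private void checkState() throws IOException {
        if (poisoned) {
            throw new IOException("Cannot update journal file because it is poisoned");
        }
    }
}

/** Models checkpoint() as described in the report, before the fix. */
class PreFixWriteAheadLog {
    private PoisonableJournal journal = new PoisonableJournal();

    PoisonableJournal journal() { return journal; }

    void checkpoint() throws IOException {
        journal.fsync();                   // throws when poisoned ...
        journal = new PoisonableJournal(); // ... so this line is never reached
    }
}
```

Once update() has failed a single time in this model, every later call to update() or checkpoint() throws, and the poisoned journal is never replaced. That matches the reported state in which only a restart clears the flag.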



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
