[ 
http://jira.dspace.org/jira/browse/DS-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10593#action_10593
 ] 

Tim Donohue commented on DS-302:
--------------------------------

Update to this issue...

It seems that these old "to_be_processed=false" bitstreams are *not* actually 
being processed, as they have already been cleaned from the DSpace assetstore.  
However, they are being incorrectly marked as having been processed in the 
'most_recent_checksum' and 'checksum_history' tables, where their dates are 
being updated to the current month.

The problem doesn't seem to be with the DailyReportEmailer.  Instead, 
something is updating the 'last_process_*_date' fields for these old 
bitstreams in the checksum tables.
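
To make the symptom easier to spot, here is a minimal sketch of the check 
described above: list rows that are flagged "to_be_processed=false" but whose 
last-processed date falls in the current month.  It runs against an in-memory 
SQLite stand-in rather than a real DSpace PostgreSQL database.  The column 
names 'bitstream_id' and 'last_process_end_date', the helper names, and all 
demo rows are assumptions for illustration, so check them against your own 
schema first.

```python
import sqlite3
from datetime import date


def build_demo_db():
    """Create an in-memory stand-in for 'most_recent_checksum'.

    Assumption: the table and rows are invented demo data; a real DSpace
    install uses PostgreSQL and has more columns than shown here.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE most_recent_checksum ("
        " bitstream_id INTEGER PRIMARY KEY,"
        " to_be_processed INTEGER,"      # 1 = true, 0 = false
        " last_process_end_date TEXT)"   # ISO 'YYYY-MM-DD HH:MM:SS'
    )
    conn.executemany(
        "INSERT INTO most_recent_checksum VALUES (?, ?, ?)",
        [
            (1, 1, "2009-08-02 04:05:00"),  # active, checked this month
            (2, 0, "2008-06-15 04:05:00"),  # deleted, date left alone
            (3, 0, "2009-08-02 04:06:00"),  # deleted, yet date was updated
        ],
    )
    return conn


def find_suspect_rows(conn, today):
    """Return (bitstream_id, last_process_end_date) for rows flagged
    to_be_processed = false that were still dated in today's month."""
    month_start = today.replace(day=1).isoformat()
    cur = conn.execute(
        "SELECT bitstream_id, last_process_end_date"
        " FROM most_recent_checksum"
        " WHERE to_be_processed = 0 AND last_process_end_date >= ?"
        " ORDER BY bitstream_id",
        (month_start,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    for row in find_suspect_rows(build_demo_db(), date(2009, 8, 9)):
        print(row)
```

Running the same query right after the first checker run of a month, and 
again after a later run, should show whether the unexpected date updates 
really happen only on the first run.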

I still haven't been able to track down what is causing these strange updates, 
or whether the problem is unique to running the Checksum Checker in Time 
Duration (-d) mode.

> Checksum Checker re-processes bitstreams marked "to_be_processed=false" at 
> beginning of month.
> ----------------------------------------------------------------------------------------------
>
>                 Key: DS-302
>                 URL: http://jira.dspace.org/jira/browse/DS-302
>             Project: DSpace 1.x
>          Issue Type: Bug
>          Components: DSpace API
>    Affects Versions: 1.5.2
>         Environment: RHEL4, PostgreSQL 8.1.16
>            Reporter: Tim Donohue
>
> We've been using the 'org.dspace.checker.DailyReportEmailer' reporting class 
> to send us reports of checksum issues in DSpace.  We've been noticing that 
> after the first time the Checksum Checker runs in a given month, the email 
> report received includes a number of bitstreams (usually about 50 to 100) 
> which had already been marked as no longer to be processed 
> (to_be_processed=false).  This problem *only* occurs the first time the 
> Checksum Checker runs in a given month.
>
> Checking in the 'most_recent_checksum' table, we have bitstreams which were 
> deleted over a year ago, and are flagged as "to_be_processed=false", but 
> still were processed during our most recent Checksum Checker run.
>
> So far, I've been unable to track down how/why the Checksum Checker is 
> continually re-processing these bitstreams (or if it's actually just 
> incorrectly updating the 'most_recent_checksum' table as though they were 
> re-processed -- as the 'last_process_*_date' fields are from the current 
> month).  So, I thought I'd log this to see if anyone else has run into it.
>
> Here are the cron jobs we're running for both the Checksum Checker and the 
> DailyReportEmailer.  As you can tell, we are currently only running each 
> once per week:
>
> #Schedule DSpace Checksum checker to run once a week,
> #every Sunday, running for 2 hrs max
> 0 4 * * 0 dspace/bin/checker -d2h -p
> #Send email of Checksum checker results on Sunday
> 0 6 * * 0 dspace/bin/dsrun org.dspace.checker.DailyReportEmailer -a
