[
http://jira.dspace.org/jira/browse/DS-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=10593#action_10593
]
Tim Donohue commented on DS-302:
--------------------------------
Update to this issue...
It seems that these old "to_be_processed=false" bitstreams are *not* actually
being processed, as they have already been cleaned from the DSpace assetstore.
However, they are being incorrectly marked as having been processed in the
'most_recent_checksum' and 'checksum_history' tables, where the dates are being
updated to the most recent month.
The problem doesn't seem to be with the DailyReportEmailer. Instead, something
is updating the 'last_process_*_date' fields for these old bitstreams in the
checksum tables.
I still haven't been able to track down what is causing these strange updates,
or whether they are unique to running the Checksum Checker in Time Duration
(-d) mode.
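For anyone who wants to check whether their own install shows the same
symptom, here is a rough sketch of the kind of query involved. It is not taken
from the DSpace code. It uses an in-memory SQLite table as a stand-in for
'most_recent_checksum', and the column names bitstream_id and
last_process_end_date are assumptions based on the 'last_process_*_date'
fields mentioned in the issue. The sample rows and dates are invented for
illustration only.

```python
import sqlite3
from datetime import date


def month_start(today):
    """Return the first day of the month containing `today`."""
    return today.replace(day=1)


def suspicious_rows(conn, today):
    """Find rows flagged to_be_processed=false whose last-processed date
    still falls in the current month (the symptom described above)."""
    cutoff = month_start(today).isoformat()
    cur = conn.execute(
        "SELECT bitstream_id, last_process_end_date "
        "FROM most_recent_checksum "
        "WHERE to_be_processed = 0 AND last_process_end_date >= ? "
        "ORDER BY bitstream_id",
        (cutoff,),
    )
    return cur.fetchall()


def demo():
    # Stand-in table; the column names are assumptions, not the real schema.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE most_recent_checksum ("
        "bitstream_id INTEGER PRIMARY KEY, "
        "to_be_processed INTEGER, "
        "last_process_end_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO most_recent_checksum VALUES (?, ?, ?)",
        [
            (1, 1, "2009-08-02 04:10:00"),  # live bitstream, processed normally
            (2, 0, "2009-08-02 04:11:00"),  # deleted, yet date was updated
            (3, 0, "2008-05-11 04:09:00"),  # deleted, date left untouched
        ],
    )
    return suspicious_rows(conn, date(2009, 8, 9))


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Against a real PostgreSQL database the same WHERE clause should apply with
to_be_processed = false, but the column names would need to be checked
against the actual schema first.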
> Checksum Checker re-processes bitstreams marked "to_be_processed=false" at
> beginning of month.
> ----------------------------------------------------------------------------------------------
>
> Key: DS-302
> URL: http://jira.dspace.org/jira/browse/DS-302
> Project: DSpace 1.x
> Issue Type: Bug
> Components: DSpace API
> Affects Versions: 1.5.2
> Environment: RHEL4, PostgreSQL 8.1.16
> Reporter: Tim Donohue
>
> We've been using the 'org.dspace.checker.DailyReportEmailer' reporting class
> to send us reports of CheckSum issues in DSpace. We've been noticing that
> after the first time the CheckSum checker runs in a given month, the email
> report received includes a number of bitstreams (usually about 50 to 100)
> which had already been marked as no longer to be processed
> (to_be_processed=false). This problem *only* occurs the first time the
> CheckSum Checker runs in a given month.
> Checking in the 'most_recent_checksum' table, we have bitstreams which were
> deleted over a year ago, and are flagged as "to_be_processed=false", but
> still were processed during our most recent Checksum Checker run.
> So far, I've been unable to track down how/why the Checksum Checker is
> continually re-processing these bitstreams (or if it's actually just
> incorrectly updating the 'most_recent_checksum' table as though it was
> re-processed -- as the 'last_process_*_date' fields are from the current
> month). So, I thought I'd log this to see if anyone else has run into it.
> Here are the Cron jobs we're running for both the CheckSumChecker and the
> DailyReportEmailer. As you can tell, we are currently only running each
> once per week:
> #Schedule DSpace Checksum checker to run once a week,
> #every Sunday, running for 2 hrs max
> 0 4 * * 0 dspace/bin/checker -d2h -p
> #Send email of Checksum checker results on Sunday
> 0 6 * * 0 dspace/bin/dsrun org.dspace.checker.DailyReportEmailer -a