Paul Slootman <[EMAIL PROTECTED]> writes:

> please take a look at the following Debian bug report.
> I've written a few comments at the end.
> (Please preserve [EMAIL PROTECTED] in the CC: list when
> responding, so that your responses can be tracked by the Debian BTS.)
>
> On Sat 25 Nov 2006, Tim Connors wrote:
> > Subject: Bug#400329: wwwoffle: lock-files in concurrent downloading
> >  broken either way
> > From: Tim Connors <[EMAIL PROTECTED]>
> > To: Debian Bug Tracking System <[EMAIL PROTECTED]>
> >
> > Package: wwwoffle
> > Version: 2.9-2
> > Severity: grave
> > Justification: causes non-serious data loss
> >
> > wwwoffle has the setting:
> >
> >   # lock-files = yes | no
> >   #     Enable the use of lock files to stop more than one WWWOFFLE
> >   #     process from downloading the same URL at the same time
> >   #     (default=no).
> >
> > Either way this is set, it is broken.
> >
> > If set to yes, it seems that a lockfile only manages to tell the
> > second process to give up loading the page at all, giving back an
> > HTTP 500 WWWOFFLE Server Error:
> >
> >   for i in `seq 1 10` ; do lynx -dump http://www.google.com.au & done
> >  _________________________________________________________________
> >
> >                        WWWOFFLE Server Error
> >
> >    The WWWOFFLE server encountered a fatal error:
> >
> >      Cannot open the spooled web page to read.
> >
> >    The program cannot continue to service this request.
> >  _________________________________________________________________

You should not be getting this error message.  You should get this
error instead:

 _________________________________________________________________

                        WWWOFFLE File Locked

   Your request for URL <<<URL>>> is already being modified by another
   WWWOFFLE server.

   Help

   The page that you have requested is being modified by another
   server and you cannot currently access it.  Reloading this page
   will wait for the other server to finish making modifications.

   To ensure that only one WWWOFFLE server modifies each cached file
   at a time a lock file is used.  While one server is modifying the
   cached file the lock file exists so that other servers know about
   this.  Until the first server has finished the cached file is not
   valid and cannot be accessed by another server.

   If you see this error message all of the time, even when offline,
   then it is possible that the lock file exists but there is no
   server modifying the page.  This can only happen when the WWWOFFLE
   server that was modifying the page does not exit properly.  It is
   important that you make sure that you allow the WWWOFFLE servers to
   finish and that you do not kill them or shut down the machine while
   they are still modifying a page.  To remove the lock file it is
   necessary to purge the WWWOFFLE cache, since this will clean up any
   bad lock files.
 _________________________________________________________________

The purpose of this page is to explain exactly what has happened.  The
system is not broken, but busy.  If it happens all the time then it
might be broken, and there is even an explanation of how to solve it.
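The scheme that the error page describes (one server holds a lock file
while it modifies the cached file; other servers wait, but only for a
bounded time before giving up) can be sketched in a few lines.  This
is an illustrative sketch only, not WWWOFFLE's actual C implementation;
the function name, lock path, and poll interval are all hypothetical:

```python
import os
import time

LOCK_TIMEOUT = 60 / 6   # 1/6th of a 60 second socket timeout

def fetch_with_lock(lockfile, download, timeout=LOCK_TIMEOUT, poll=0.1):
    """Run download() while holding lockfile; wait up to timeout for it."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT|O_EXCL is atomic: exactly one process can create
            # the lock file, so exactly one process wins the race.
            fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                # Stale or long-held lock: give up rather than wait
                # forever (the point at which WWWOFFLE would show its
                # "File Locked" page instead).
                raise TimeoutError("lock held too long: %s" % lockfile)
            time.sleep(poll)
    try:
        os.close(fd)
        return download()   # fetch and spool the page while locked
    finally:
        # Release the lock.  A crash before this line is what leaves
        # the stale lock file discussed below.
        os.remove(lockfile)
```

The bounded wait is the important design point: waiting indefinitely
would deadlock every server on a stale lock, while giving up
immediately produces the spurious error the bug report complains
about.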
This error page should not appear until a timeout of 1/6th of the
socket timeout option in the configuration file has elapsed (so 10
seconds for a 60 second socket timeout).

The purpose of the lockfile is to stop the same file being downloaded
many times.  One of the key features of WWWOFFLE is the ability to
reduce the number of bytes downloaded.  To do this you need a method
to ensure that multiple downloads of the same file do not occur.

> > If set to no, then the first process to hit the cache gets their
> > download, but the second process only retrieves as much as had
> > reached the first process at that time.  So the second download
> > ends up incomplete.  No error is returned, so it may not even be
> > apparent until a later date -- hence data loss.
>
> I've responded that it's not a grave loss of data, as that's what the
> option is for; say "yes" if you want to prevent that.

I agree.  The lockfiles gave some people problems, so I let them have
no lock files at the risk of broken files.  You have a choice.

> That said, the error you get when a page is indeed locked is a bit
> unexpected: "This is an internal WWWOFFLE server error that should
> not occur."  It's after all a documented option...
>
> IMHO the second process should wait for the completion of the first
> download process before proceeding; giving a "500 WWWOFFLE Server
> Error" is not the right thing to do here...

There is the risk of a real bug in WWWOFFLE that causes the lockfile
not to be deleted when it should be.  In this case, if the second
process waits for the lockfile then it will wait forever.  Eventually
all servers are waiting for the same lockfile and nobody is ever going
to delete it.  This is why there is a timeout before showing the
special lockfile error message.

-- 
Andrew.
----------------------------------------------------------------------
Andrew M. Bishop                              [EMAIL PROTECTED]
                                  http://www.gedanken.demon.co.uk/

WWWOFFLE users page:
  http://www.gedanken.demon.co.uk/wwwoffle/version-2.9/user.html


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe".  Trouble?  Contact [EMAIL PROTECTED]