>>>>> On Wed, 19 Sep 2007 18:58:13 +0200, Marc Cousin said:
> 
> On Wednesday 19 September 2007 16:59:10 Martin Simmons wrote:
> > >>>>> On Wed, 19 Sep 2007 11:54:37 +0200, Cousin Marc said:
> > >
> > > I think the problem is linked to the fact that dbcheck works more or less
> > > row by row.
> > >
> > > If I understand correctly, the problem is that you have duplicates in the
> > > Path table, since the error comes from
> > > SELECT PathId FROM Path WHERE Path='%s' returning more than one row
> > >
> > > You could try this query; it would probably be much faster:
> > >
> > > -- keep the smallest pathid for each duplicated path, delete the rest
> > > delete from path
> > > where pathid not in (
> > >   select min(pathid) from path
> > >   where path in
> > >           (select path from path group by path having count(*) > 1)
> > >   group by path)
> > > and path in (
> > >   select path from path group by path having count(*) > 1);
> > >
> > > I've just done it very quickly and haven't had time to double-check, so
> > > make a backup first if you want to try it... :)
> > > Or at least do it in a transaction so you can roll back if anything goes
> > > wrong.
> >
> > Deleting from path like that could leave the catalog in a worse state than
> > before, with dangling references in the File table.  The dbcheck routine
> > updates the File table to replace references to deleted pathids.
> >
> > Moreover, if deleting duplicate pathids is slow (i.e. there are many of
> > them), then the catalog could be badly corrupted, so I don't see how you
> > can be sure that the File records are accurate.  It might be better to wipe
> > the catalog and start again, or at least prune all of the file records
> > before running dbcheck.
> >
> 
> You're right, I didn't think of that problem ... I just supposed the
> duplicate records got there because of two transactions doing the same
> thing at the same time.
> 
> Anyhow, I think we could improve dbcheck with set-based queries like the
> previous one (we could clean the File table beforehand too with a query
> of the same kind).

Right.
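
For the record, a safer version of that cleanup would repoint the File
table at the surviving PathId before deleting the duplicates, and do the
whole thing in one transaction.  A rough sketch, assuming PostgreSQL and
the stock table/column names (untested, so inspect the results before
committing):

begin;

-- repoint File rows from duplicate pathids to the smallest pathid
update file
set pathid = keep.minid
from (select path, min(pathid) as minid
      from path
      group by path
      having count(*) > 1) as keep,
     path p
where p.path = keep.path
  and file.pathid = p.pathid
  and file.pathid <> keep.minid;

-- the duplicate Path rows are now unreferenced and can be deleted
delete from path
where pathid not in (select min(pathid) from path group by path);

commit;  -- or rollback if anything looks wrong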


> And even more obviously, I see this as a good reason to add integrity
> constraints, as it seems Bacula sometimes puts junk in the database...
> In this example, Path should have a primary key on PathId and a UNIQUE
> NOT NULL constraint on Path, and File should have a foreign key
> constraint on PathId ...

I think the constraints were removed because of performance problems, but
maybe that won't be so bad with the batch insert code?
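
For reference, the constraints Marc describes would look something like
this (just a sketch; any existing duplicates would have to be cleaned up
first or the ALTERs will fail):

-- one row per path, with a guaranteed-unique path string
alter table path add primary key (pathid);
alter table path alter column path set not null;
alter table path add unique (path);

-- every File row must point at an existing Path row
alter table file add foreign key (pathid) references path (pathid);

The batch insert code would presumably also need to handle unique-key
violations when two sessions try to insert the same path at once.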

__Martin
