Package: bacula-console-qt
Version: 5.2.6+dfsg-9
Severity: important
Dear Maintainer,
For recovery of files by means of bRestore, it is necessary to execute at first
.bvfs_update.
When trying to build the PathHierarchy with .bvfs_update, the update for big
jobs take forever...
Steps to Reproduce:
- have a big dataset (> 1 million files AND a deep directory hierarchy)
- .bvfs_update job=N
- mysql used 100% CPU very mach time
- at this time no run jobs on the schedule
Additional Information:
I updated bacula from version 5.0 to 5.2 (debian 6 -> 7). In the new bat
version the "Browse Cataloged Files" function became deprecated. It is offered
to use bRestore. My bat hanged at bRestore choice. At this time mysql uses 100%
of Cpu
I enable slow query log on mysql.
The analysis of slow query is lower
TOP 1 from "#mysqldumpslow mysql-slow.log"
Count: 249 Time=243.29s (60579s) Lock=0.00s (0s) Rows=0.0 (0),
bacula[bacula]@localhost
INSERT INTO PathVisibility (PathId, JobId) SELECT a.PathId,N FROM ( SELECT
DISTINCT h.PPathId AS PathId FROM PathHierarchy AS h JOIN PathVisibility AS p
ON (h.PathId=p.PathId) WHERE p.JobId=N) AS a LEFT JOIN (SELECT PathId FROM
PathVisibility WHERE JobId=N) AS b ON (a.PathId = b.PathId) WHERE b.PathId IS
NULL
I carried out the analysis and optimization of this query.
Original query:
INSERT INTO PathVisibility (PathId, JobId)
SELECT a.PathId,N
FROM (
SELECT DISTINCT h.PPathId AS PathId
FROM PathHierarchy AS h
JOIN PathVisibility AS p
ON (h.PathId=p.PathId)
WHERE p.JobId=N) AS a
LEFT JOIN (
SELECT PathId
FROM PathVisibility
WHERE JobId=N) AS b
ON (a.PathId = b.PathId)
WHERE b.PathId IS NULL
Optimized query:
INSERT INTO PathVisibility (PathId, JobId)
SELECT a.PathId,N
FROM (
SELECT DISTINCT h.PPathId AS PathId
FROM PathHierarchy AS h
JOIN PathVisibility AS p
ON (h.PathId=p.PathId)
WHERE p.JobId=N) AS a
LEFT JOIN PathVisibility AS b
ON (b.JobId=N and a.PathId = b.PathId)
WHERE b.PathId IS NULL
Results of query are identical.
In original query in join on the right there is a temporary table (subquery
"SELECT PathId FROM PathVisibility WHERE JobId=N"). The temporary table has no
index across the field of PathId. For each record of the left table there is a
full search of the right table.
The table PathVisibility has a compound index on the fields JobId and PathId.
In the optimized query join is carried out with use of this index.
After introduction of editings and recompilation bacula of .bvfs_update it was
executed in 17 minutes. With old inquiry in 16-17 hours it didn't end and was
still very far from end.
Count: 19052 Time=0.01s (152s) Lock=0.00s (1s) Rows=0.0 (0),
bacula[bacula]@localhost
INSERT INTO PathVisibility (PathId, JobId) SELECT a.PathId,N FROM ( SELECT
DISTINCT h.PPathId AS PathId FROM PathHierarchy AS h JOIN PathVisibility AS p
ON (h.PathId=p.PathId) WHERE p.JobId=N) AS a LEFT JOIN PathVisibility AS b ON
(JobId=N and a.PathId = b.PathId) WHERE b.PathId IS NULL
The similar situation was stated in
http://sourceforge.net/mailarchive/forum.php?thread_name=3ccaa3a6fb746ee27273ef0b1db9045b%40bugs.bacula.org&forum_name=bacula-bugs
I have a base in mysql. So detailed check is necessary for other bases. I asked
the friend to check on postgresql for a lot of work. After 1,5 hours of
processing of one big job were compelled to stop postgresql.
Patch:
diff -Naur bacula-5.2.6+dfsg/src/cats/bvfs.c_orig
bacula-5.2.6+dfsg/src/cats/bvfs.c
--- bacula-5.2.6+dfsg/src/cats/bvfs.c_orig 2012-06-02 13:26:27.000000000
+0400
+++ bacula-5.2.6+dfsg/src/cats/bvfs.c 2014-02-06 10:42:34.798440879 +0400
@@ -404,10 +404,9 @@
"SELECT DISTINCT h.PPathId AS PathId "
"FROM PathHierarchy AS h "
"JOIN PathVisibility AS p ON (h.PathId=p.PathId) "
- "WHERE p.JobId=%s) AS a LEFT JOIN "
- "(SELECT PathId "
- "FROM PathVisibility "
- "WHERE JobId=%s) AS b ON (a.PathId = b.PathId) "
+ "WHERE p.JobId=%s) AS a "
+ "LEFT JOIN PathVisibility AS b "
+ "ON (b.JobId=%s and a.PathId = b.PathId) "
"WHERE b.PathId IS NULL", jobid, jobid, jobid);
}
-- System Information:
Debian Release: 7.3
APT prefers stable-updates
APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.2.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/dash
Versions of packages bacula-console-qt depends on:
ii bacula-common 5.2.6+dfsg-9
ii libc6 2.13-38
ii libcap2 1:2.22-1.2
ii libgcc1 1:4.7.2-5
ii libqtcore4 4:4.8.2+dfsg-11
ii libqtgui4 4:4.8.2+dfsg-11
ii libssl1.0.0 1.0.1e-2+deb7u3
ii libstdc++6 4.7.2-5
ii libwrap0 7.6.q-24
ii zlib1g 1:1.2.7.dfsg-13
bacula-console-qt recommends no packages.
bacula-console-qt suggests no packages.
-- no debconf information
--
To UNSUBSCRIBE, email to [email protected]
with a subject of "unsubscribe". Trouble? Contact [email protected]