Hello,
On 5/21/2006 3:48 PM, Wolfgang Denk wrote:
> In message <[EMAIL PROTECTED]> you wrote:
>> Anyway, in typical scenarios, the only reliable way of guessing tape
>> capacity (or remaining capacity) is estimating based on individual data,
>> i.e. using existing tapes with similar data on them as reference.
>
> I wouldn't call this a "reliable" way. Here is a list of 40 full
> SLR100 tapes (uncompressed capacity 50 GB) in my data base:
> ...
> They were written with h/w compression turned on; as you can see the
> range is 46.4 ... 90.8 GB; the average is 69.7 GB. So what should we
> guesstimate as "remaining free capacity" after writing for example 40
> GB of data?

The important part of my sentence is "tapes with similar data"... I know
the effect you describe; for example, it's possible to put up to 20 GB of
log file data (only ASCII text) onto a 4 GB DDS tape, but only 3 GB of
compressed movie data fit onto it (both with hardware compression on). Let
me do some advertising for baculareport.pl:
In the following report, notice the "Rel." percentage (Reliability; some
estimate about the reliability of the average tape capacity as
calculated from the full volumes of that pool).
bacula volume / pool status report 2006-05-21 16:18
Volumes Are Full, Other, Append, Empty, aWay or X (error)
Pool Diff
############################################################----------
|0% |20% |40% |60% |80% 100%|
176.28GB used Rel: 72% free 29.25GB
35 F Volumes 1 A and 6 E Volumes
Differential backups over several systems - hard to determine a
well-founded tape capacity due to wildly varying backup data (anything
from text files to compressed media).
Pool Full
###################################################################---
|0% |20% |40% |60% |80% 100%|
1.24TB used Rel: 80% free 51.38GB
58 F Volumes 1 A and 2 E Volumes
Basically the same problem here, but I keep more backup generations than
for differential backups, so I've got a slightly better statistical basis
for my capacity guessing.
Pool Incr
######################################################################
|0% |20% |40% |60% |80% 100%|
37.16GB used Rel: 76% free 1.00B
5 F Volumes 1 A and 0 E Volumes
Pool QIC
#####################################################################-
|0% |20% |40% |60% |80% 100%|
21.24GB used Rel: 98% free 447.89MB
42 F Volumes 1 A and 0 E Volumes
Only mail files stored here, and no hardware compression on these tapes.
The 98% reliability represents the differences in tape capacity that are
typical in my experience: tapes of different make, different
manufacturing year, and not least different numbers of file marks per
tape. The raw data from the catalog look like this:
mysql> select min(VolBytes)/1024*1024 AS Minimum, avg(VolBytes)/1024/1024 AS
Average, max(VolBytes)/1024/1024 AS Maximum, std(VolBytes)/1024/1024 AS
Standarddeviation from Media where MediaType='DC6525' and VolStatus='Full';
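# MBytes here, except Minimum: that column is in bytes because of the
# "/1024*1024" slip in the query (514225073 bytes is about 490.4 MB).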
+--------------+--------------+----------+-------------------+
| Minimum | Average | Maximum | Standarddeviation |
+--------------+--------------+----------+-------------------+
| 514225073.00 | 513.47504393 | 535.3087 | 8.34908976 |
+--------------+--------------+----------+-------------------+
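
As a rough illustration of how such a "Rel." figure could be derived from these catalog statistics, here is a minimal Python sketch. It assumes that reliability is 1 minus the ratio of standard deviation to average of the full volumes' sizes; that formula is only an assumption that happens to match the numbers above, not something taken from the baculareport.pl source. The function name is made up for the example.

```python
from statistics import mean, pstdev


def capacity_reliability(full_volume_sizes):
    """Return (average, std deviation, reliability) for full-volume sizes.

    Reliability is computed as 1 - stddev/average, clamped to [0, 1].
    This formula is an assumption made for illustration.
    """
    if not full_volume_sizes:
        raise ValueError("need at least one full volume")
    avg = mean(full_volume_sizes)
    # Population standard deviation, which is what MySQL's STD() returns.
    dev = pstdev(full_volume_sizes)
    if avg <= 0:
        return avg, dev, 0.0
    rel = max(0.0, min(1.0, 1.0 - dev / avg))
    return avg, dev, rel


# Average and standard deviation (in MB) from the DC6525 query result.
rel_from_table = 1.0 - 8.34908976 / 513.47504393
print(f"{rel_from_table:.0%}")  # prints 98%
```

With a wide spread like 46.4 ... 90.8 GB around an average of 69.7 GB, the same calculation would give a clearly lower value, which is the point of reporting it next to the capacity estimate.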
> 6 GB? 29 GB? 50 GB?

In your example, we'd have a low reliability of "my" guess,
corresponding to a high standard deviation. In my example, the report
tells me quite reliably when I've got to go shopping for new tapes. Not
in enough detail to know "during job xxx the tapes will fill", but enough
to see that the backups of the next weeks or months will probably need
more volumes available.
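
The "time to go shopping" estimate can be sketched roughly like this in Python. This is an assumption about how one might do it, not a description of what baculareport.pl does: free space is taken as the average full-volume size times the number of empty volumes, plus whatever the average suggests is left on the volumes still being appended to. All names and numbers are hypothetical.

```python
import math


def estimated_free(avg_full_size, append_volume_sizes, empty_volumes):
    """Estimate remaining pool capacity from the average full-volume size."""
    # Space presumably left on each append volume; never negative, because
    # a tape can end up holding more than the average.
    left_on_append = sum(max(avg_full_size - used, 0)
                         for used in append_volume_sizes)
    return left_on_append + empty_volumes * avg_full_size


def volumes_to_buy(expected_backup_size, free_size, avg_full_size):
    """Number of extra volumes needed for the expected backups (0 if none)."""
    shortfall = expected_backup_size - free_size
    if shortfall <= 0:
        return 0
    return math.ceil(shortfall / avg_full_size)


# Hypothetical pool: average full volume 500 MB, one append volume holding
# 200 MB so far, two empty volumes, and 2000 MB of backups expected.
free = estimated_free(500, [200], 2)
print(free, volumes_to_buy(2000, free, 500))
```

How far to trust the result depends on the reliability figure for the pool: with a high standard deviation, the estimate is only a coarse hint.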

> Depending on previous tape use, current file system content, and backup
> schedules, I don't know in advance which file system will end up on which
> tape, so even if I know the likely compression rate for a specific
> file system, I would not be able to put this information to use.

Well, if you need that information, you should separate the tapes into
multiple pools and store comparable datasets per pool.
Or, but this is probably much more difficult, you'd have to track the
amount of data per job, determine which parts of which jobs are on which
tape, and thus estimate the tape capacity. Creating the necessary
catalog queries and program logic is definitely not something I want to
do :-)
Arno
> Best regards,
> Wolfgang Denk
--
IT-Service Lehmann [EMAIL PROTECTED]
Arno Lehmann http://www.its-lehmann.de
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users