Hello,

On 5/21/2006 3:48 PM, Wolfgang Denk wrote:
In message <[EMAIL PROTECTED]> you wrote:

Anyway, in typical scenarios, the only reliable way of guessing tape capacity (or remaining capacity) is estimating based on individual data, i.e. using existing tapes with similar data on them as a reference.


I wouldn't call this a "reliable" way. Here is a list of 40 full
SLR100 tapes (uncompressed capacity 50 GB) in my database:
...
They were written with h/w compression turned on; as you can see, the
range is 46.4 ... 90.8 GB; the average is 69.7 GB. So what should we
guesstimate as "remaining free capacity" after writing, for example, 40
GB of data?

The important part of my sentence is "tapes with similar data"... I know the effect you describe; for example, it's possible to put up to 20 GB of log file data (ASCII text only) onto a 4 GB DDS tape, but only 3 GB of compressed movies fit onto it (both with hardware compression on). Let me do some advertising for baculareport.pl:

In the following report, notice the "Rel." percentage (Reliability; an estimate of how reliable the average tape capacity is, as calculated from the full volumes of that pool).

bacula volume / pool status report 2006-05-21 16:18
Volumes Are Full, Other, Append, Empty, aWay or X (error)
Pool           Diff
  ############################################################----------
  |0%          |20%          |40%          |60%          |80%      100%|
  176.28GB used                                    Rel: 72% free 29.25GB
  35 F Volumes                                       1 A and 6 E Volumes

Differential backups over several systems - hard to determine a well-founded tape capacity due to wildly varying backup data (anything from text files to compressed media).
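The "Rel." figure could plausibly be computed from the spread of the full-volume sizes in a pool. Here is a minimal sketch using a coefficient-of-variation-based score - a guess at the idea, not baculareport.pl's actual formula, and the sample sizes are made up:

```python
from statistics import mean, pstdev

def reliability(full_volume_gb):
    """Hypothetical reliability score: 100% minus the coefficient of
    variation (std/mean) of the full-volume sizes in a pool.
    NOT baculareport.pl's actual formula, which isn't shown here."""
    avg = mean(full_volume_gb)
    return 100.0 * (1.0 - pstdev(full_volume_gb) / avg)

# Tapes holding similar data cluster tightly -> high score (made-up values)
similar = [20.8, 21.1, 20.9, 21.0]

# Mixed-content tapes spanning roughly 46 .. 91 GB -> low score (made-up values)
mixed = [46.4, 52.0, 69.7, 84.1, 90.8]

print(round(reliability(similar)), round(reliability(mixed)))
```

A tightly clustered pool scores high; widely scattered full-volume sizes score low, matching the intuition that only tapes with similar data are a useful reference.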

Pool           Full
  ###################################################################---
  |0%          |20%          |40%          |60%          |80%      100%|
  1.24TB used                                      Rel: 80% free 51.38GB
  58 F Volumes                                       1 A and 2 E Volumes

Basically the same problem here, but I keep more backup generations than for differential backups, so I've got a slightly better statistical basis for my capacity guessing.

Pool           Incr
  ######################################################################
  |0%          |20%          |40%          |60%          |80%      100%|
  37.16GB used                                       Rel: 76% free 1.00B
  5 F Volumes                                        1 A and 0 E Volumes

Pool            QIC
  #####################################################################-
  |0%          |20%          |40%          |60%          |80%      100%|
  21.24GB used                                    Rel: 98% free 447.89MB
  42 F Volumes                                       1 A and 0 E Volumes


Only mail files are stored here, and no hardware compression on these tapes. The 98% reliability represents the - in my experience - typical differences in tape capacity between tapes of different make, different manufacturing year, and not least different numbers of file marks per tape. The raw data from the catalog look like this:

mysql> select min(VolBytes)/1024/1024 AS Minimum,
    ->        avg(VolBytes)/1024/1024 AS Average,
    ->        max(VolBytes)/1024/1024 AS Maximum,
    ->        std(VolBytes)/1024/1024 AS Standarddeviation
    ->        from Media where MediaType='DC6525' and VolStatus='Full';
# MBytes here.
+--------------+--------------+----------+-------------------+
| Minimum      | Average      | Maximum  | Standarddeviation |
+--------------+--------------+----------+-------------------+
| 490.40324497 | 513.47504393 | 535.3087 |        8.34908976 |
+--------------+--------------+----------+-------------------+
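For readers without a Bacula catalog handy, the same aggregates can be computed from a plain list of VolBytes values. A small sketch with invented sample sizes - note the division by 1024 twice, bytes to KB to MB:

```python
from statistics import mean, pstdev

# VolBytes values (bytes) for a handful of full volumes --
# illustrative numbers, not the actual catalog rows from this thread.
vol_bytes = [514_225_073, 538_312_900, 561_307_648, 530_841_600]

MB = 1024 * 1024  # divide by 1024 twice to get MBytes

minimum = min(vol_bytes) / MB
average = mean(vol_bytes) / MB
maximum = max(vol_bytes) / MB
stddev  = pstdev(vol_bytes) / MB  # MySQL's STD() is the population std deviation

print(f"min {minimum:.2f} MB, avg {average:.2f} MB, "
      f"max {maximum:.2f} MB, std {stddev:.2f} MB")
```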


6 GB?   29 GB?   50 GB?

In your example, "my" guess would have low reliability, corresponding to a high standard deviation. In my example, the report tells me quite reliably when I've got to go shopping for new tapes. Not detailed enough to know that "during job xxx the tapes will fill", but enough to see that next week's or next month's backups will probably need more volumes available.
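That "go shopping" decision boils down to dividing the estimated free capacity by the expected backup volume per run. A toy sketch with hypothetical numbers (the 29.25 GB matches the Diff pool report above; the weekly growth figure is made up):

```python
def weeks_until_empty(free_gb, gb_per_week):
    """Rough count of weekly backup runs the remaining capacity covers.
    Only as trustworthy as the free-capacity estimate it is fed."""
    return free_gb / gb_per_week

free_gb = 29.25     # "free" estimate from the Diff pool report
gb_per_week = 5.0   # assumed weekly differential volume (hypothetical)

print(f"~{weeks_until_empty(free_gb, gb_per_week):.0f} weeks of backups left")
```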

Depending on previous tape use, current file system content, and backup
schedules, I don't know in advance which file system will end up on which
tape, so even if I know the likely compression rate for a specific
file system, I would not be able to put this information to use.

Well, if you needed that information, you could separate the tapes into multiple pools and store comparable datasets per pool.

Or - but this is probably much more difficult - you'd have to track the amount of data per job, determine which parts of which jobs are on which tape, and thus estimate the tape capacity. Creating the necessary catalog queries and program logic is definitely not something I'd want to do :-)


Arno



--
IT-Service Lehmann                    [EMAIL PROTECTED]
Arno Lehmann                  http://www.its-lehmann.de


_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
