This is pretty much what I do as well. A huge advantage in doing it this way, at the filespace level, is that it is MUCH faster than counting individual files. We have about 1 billion files divided between 3 TSM servers, compared to only several thousand filespaces.
I call it CHURN rather than ABUSE. Some servers and some applications are going to have a high churn ratio (occupancy/filespace), with a lot of inactive versions as a rule, such as email servers. The ratio can be very high (5+) and still be reasonable for an email server's mail storage filespace. Our email servers are in a management class where the copy groups have a higher RETEXTRA setting, so when a user calls and says "Thunderbird ate my inbox! It happened last week. Help!" I can restore it for them from the inactive versions. OTOH, the ratio for a typical Windows PC client workstation should be around 0.6 to 0.8, and for a relatively static file server should be close to 1.0. I study these numbers a lot, and the sample numbers Maurice has below are in the same range that I see. I only get suspicious when ((occupancy / filespace) > (RETEXTRA / 2)). YMMV. I may use your SELECT as being much simpler than what I do to arrive at the same number. Thanks, Maurice van 't Loo! Roger Deschner University of Illinois at Chicago rog...@uic.edu ======I have not lost my mind -- it is backed up on tape somewhere.===== On Thu, 28 Apr 2016, Maurice van 't Loo wrote: >Hello Gary, > >Just guessing the actual reason, it might be that they want to know the >amount of TSM storage compared with the amount of storage on the clients. >While counting the number of objects doesn't give you much information, >maybe it's best to compare the filespaces with occupancy. That is what I >call the "abuse factor" > >NODE >FS >FS_GB OCC_GB ABUSE >------------------------------- >------------------------------------------------------------- >----------------- --------------- ----------------- >XXXXXXXXXXXXXXXXXXXXXX >/csminstall/AIX/images >35.06 149.36 4.2 >XXXXXXXXXXXXXXXXXXX >/nim/aix54 >39.13 39.15 1.0 >XXXXXXXXXXXXXXXX >/csminstall/AIX/aix610 >37.07 36.99 0.9 >XXXXXXXXXXXXXXXX >/home/dvpt >14.07 35.39 2.5 >XXXXXXXXXXXXX >/build >24.28 33.24 1.3 >XXXXXXXXXXXXXXXX >/db2/dvptdb >1.57 30.48 19.3 >XXXXXXXXXXXXXXXXXX >/home/dvpt >30.57 30.42 0.9 >XXXXXXXXXXXXXXXXX >/csminstall/AIX/products >25.65 25.77 1.0 >XXXXXXXXXXXXXX >/build >22.53 21.89 0.9 >XXXXXXXXXXXX >/build >22.53 21.89 0.9 > >Copy/paste this in mono-type font (notepad) to get it better readable. >I use this to hunt for missing excludes (mssql databases not excluded) but >I can also use it to calculate how much space active and inactive I roughly >have. Of course filespaces with excludes gives some mismatch. > >SQL used for above output: >select cast(substr(f.NODE_NAME,1,30) as char(30)) as >NODE,cast(substr(f.FILESPACE_NAME,1,30) as char(30)) as >FS,dec(f.CAPACITY*f.PCT_UTIL/100/1024,14,2) as >FS_GB,dec(sum(o.PHYSICAL_MB)/1024,12,2) as >OCC_GB,dec(dec(sum(o.PHYSICAL_MB),14,1)/dec(f.CAPACITY*f.PCT_UTIL/100,16,1),14,1) >as ABUSE from filespaces as f,occupancy as o where f.NODE_NAME=o.NODE_NAME >and f.FILESPACE_NAME=o.FILESPACE_NAME and f.CAPACITY>0 and f.PCT_UTIL>0 and >o.STGPOOL_NAME in (select stgpool_name from stgpools where >pooltype='PRIMARY') and o.TYPE='Bkup' group by >o.NODE_NAME,o.FILESPACE_NAME,f.NODE_NAME,f.FILESPACE_NAME,f.CAPACITY,f.PCT_UTIL >order by 4 desc fetch first 10 rows only > >For all nodes ordered by nodename: >select cast(substr(f.NODE_NAME,1,30) as char(30)) as >NODE,cast(substr(f.FILESPACE_NAME,1,30) as char(30)) as >FS,dec(f.CAPACITY*f.PCT_UTIL/100/1024,14,2) as >FS_GB,dec(sum(o.PHYSICAL_MB)/1024,12,2) as >OCC_GB,dec(dec(sum(o.PHYSICAL_MB),14,1)/dec(f.CAPACITY*f.PCT_UTIL/100,16,1),14,1) >as ABUSE from filespaces as f,occupancy as o where f.NODE_NAME=o.NODE_NAME >and f.FILESPACE_NAME=o.FILESPACE_NAME and f.CAPACITY>0 and f.PCT_UTIL>0 and >o.STGPOOL_NAME in (select stgpool_name from stgpools where >pooltype='PRIMARY') and o.TYPE='Bkup' group by >o.NODE_NAME,o.FILESPACE_NAME,f.NODE_NAME,f.FILESPACE_NAME,f.CAPACITY,f.PCT_UTIL >order by 1 > >Regards, >Maurice van 't Loo > >http://mvantloo.nl/maupack.php >Personal pack of selects (in scripts) > > >2016-04-22 16:35 GMT+02:00 Schneider, Jim <jschnei...@essendant.com>: > >> You can also start the session with -virtualnodename. It works with dsmc >> or dsmj, and avoids the need for the proxy setting. >> >> Jim Schneider >> Essendant >> >> -----Original Message----- >> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of >> Skylar Thompson >> Sent: Friday, April 22, 2016 9:30 AM >> To: ADSM-L@VM.MARIST.EDU >> Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA >> >> You can also GRANT PROXY and then use -ASNODE from one of your own nodes, >> using your node's password. I think the general node type has to match >> (i.e. any UNIX can proxy to any UNIX, but not Windows). >> >> On Fri, Apr 22, 2016 at 02:20:38PM +0000, Schneider, Jim wrote: >> > Use a server you can access and modify the nodename in the options file, >> assuming you know the password. >> > >> > Jim Schneider >> > Essendant >> > >> > -----Original Message----- >> > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf >> > Of Lee, Gary >> > Sent: Friday, April 22, 2016 9:11 AM >> > To: ADSM-L@VM.MARIST.EDU >> > Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA >> > >> > Wish I could do that. This comes from three levels above me in >> management. >> > Trying to buy more storage to sell to departments. >> > Don't ask me, I have no clue what they are doing. >> > >> > I'll look into the q backup on client side, but don't have access to all >> of them. >> > >> > >> > -----Original Message----- >> > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf >> > Of Skylar Thompson >> > Sent: Friday, April 22, 2016 10:00 AM >> > To: ADSM-L@VM.MARIST.EDU >> > Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA >> > >> > If you have access to the clients, you can use QUERY BACKUP and parse >> the A/I column. >> > >> > Honestly, though, when we've gotten this query, I've managed to push >> this back on the customers; it's not TSM's problem what's active or >> inactive, it's the customers' applications that are actually responsible >> for it. >> > Obviously you need a pretty good relationship with your customers to >> make that case, but in the end it's caused our customers to think more >> carefully about workflow in general. >> > >> > On Fri, Apr 22, 2016 at 01:51:20PM +0000, Lee, Gary wrote: >> > > Just got a request for the amount of active versus inactive data on >> our tsm servers. >> > > >> > > Is there a better way than traversing the backups table and summing? >> > > That would be a mighty long query. >> > > >> > > We have three servers, and approximately 300 clients about 200 tB >> total data. >> > > >> > > Thanks for any suggestions. >> > >> > -- >> > -- Skylar Thompson (skyl...@u.washington.edu) >> > -- Genome Sciences Department, System Administrator >> > -- Foege Building S046, (206)-685-7354 >> > -- University of Washington School of Medicine >> > >> > ********************************************************************** >> > Information contained in this e-mail message and in any attachments >> thereto is confidential. If you are not the intended recipient, please >> destroy this message, delete any copies held on your systems, notify the >> sender immediately, and refrain from using or disclosing all or any part of >> its content to any other person. >> >> -- >> -- Skylar Thompson (skyl...@u.washington.edu) >> -- Genome Sciences Department, System Administrator >> -- Foege Building S046, (206)-685-7354 >> -- University of Washington School of Medicine >> >