I've been using container pools for a while on one of our prod servers (built with container pools from the start) and in test. In the last few months, since upgrading all of my servers to 8.1.5, I created new container pools on all of the servers and switched everything over to back up to the new pools. I've done some conversions where practical, but I'm handling some of that through attrition. The conversion process is fairly painless and can be stopped and restarted.
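In case it's useful to anyone starting down the same path, the conversion I'm describing is just the standard storage pool conversion commands. Mine looked roughly like this (pool and directory names here are made up for illustration):

    define stgpool newdcpool stgtype=directory
    define stgpooldirectory newdcpool /tsmdc01,/tsmdc02
    convert stgpool oldfilepool newdcpool maxprocess=4 duration=60

The duration parameter is what makes it stoppable and restartable: when the window expires, you run convert stgpool again later and it picks up where it left off.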
So far I'm really happy with the performance of container pools, and they're a ton easier to manage. I have a couple of questions for the more experienced container pool users.

1. As mentioned, exports don't work with container pools, and I still need to move nodes around between servers occasionally. I have three levels of service. The bottom tier is for archive; its data is stored on tape only, which is the cheapest and slowest. Our customers will often decommission a server but want to keep the backup data for some amount of time. We used to export those nodes to the archive server. We can't do that anymore, which is a bit of a problem. It seems like the only way to do this now is with client replication. We're not using client replication for anything else, and it seems a bit clunky, since a TSM server can only have one replication target for all nodes on the server. It would be pretty much impossible for people who are already using client replication for its intended purpose. Is there another way to accomplish this?
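For reference, the replication workaround I'm describing would look roughly like this (server and node names invented). The first command is the part that makes it clunky, since it points the whole server at a single target:

    set replserver archsrv
    update node node1 replstate=enabled
    replicate node node1

My understanding is that after verifying the data on the archive server you'd remove the node from replication (remove replnode node1) and then delete it from the source, but I'd love to hear how others handle this.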
2. I've got a support case open with IBM on this, but we're kind of going in circles. It's only happening on 1 of my 8 servers that use container pools. For my container directories I'm using 2 TB AIX/JFS2 file systems running off fibre-channel-connected NetApps. The server often fills those file systems right to the brim, with 0 bytes free reported by df, which seems to be OK most of the time. In the last couple of weeks I started getting errors like this:

Sep 26, 2018, 2:15:46 PM ANR0204I The container state for /bucky1dc011/5a/0000000000005a8a.dcf is updated from AVAILABLE to UNAVAILABLE. (PROCESS: 138)
Sep 26, 2018, 2:15:46 PM ANR3660E An unexpected error occurred while opening or writing to the container. Container /bucky1dc011/5a/0000000000005a8a.dcf in stgpool DCPOOL has been marked as UNAVAILABLE and should be audited to validate accessibility and content. (PROCESS: 138)
Sep 26, 2018, 2:15:46 PM ANR0986I Process 138 for Move Container (Automatic) running in the BACKGROUND processed 26,514 items for a total of 8,165,761,024 bytes with a completion state of WARNING at 14:15:46. (PROCESS: 138)

The file system reports as full:

$ df /bucky1dc011
Filesystem      1K-blocks       Used Available Use% Mounted on
/dev/fslv102   2145386496 2145386496         0 100% /bucky1dc011

So I run an audit on the container (the exact commands I've been running are at the end of this message). The audit immediately marks the container back as available, even though the audit is not complete. The audit completes successfully, but the container is already back to unavailable before it's done:

Sep 26, 2018, 4:58:26 PM ANR4886I Audit Container (Scan) process started for container /bucky1dc011/5a/0000000000005a8a.dcf (process ID 199). (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 4:58:26 PM ANR0984I Process 199 for AUDIT CONTAINER (SCAN) started in the BACKGROUND at 16:58:26. (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 4:58:26 PM ANR0984I Process 198 for AUDIT CONTAINER started in the BACKGROUND at 16:58:26. (SESSION: 531830, PROCESS: 198)
Sep 26, 2018, 4:58:27 PM ANR0204I The container state for /bucky1dc011/5a/0000000000005a8a.dcf is updated from UNAVAILABLE to AVAILABLE. (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 4:58:51 PM ANR3660E An unexpected error occurred while opening or writing to the container. Container /bucky1dc011/5a/0000000000005a8a.dcf in stgpool DCPOOL has been marked as UNAVAILABLE and should be audited to validate accessibility and content. (PROCESS: 196)
Sep 26, 2018, 5:04:13 PM ANR4891I AUDIT CONTAINER process 199 ended for the /bucky1dc011/5a/0000000000005a8a.dcf container: 29207 data extents inspected, 0 data extents marked as damaged, 0 data extents previously marked as damaged reset to undamaged, and 0 data extents marked as orphaned. (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 5:04:13 PM ANR0986I Process 199 for AUDIT CONTAINER (SCAN) running in the BACKGROUND processed 29,207 items for a total of 8,043,383,903 bytes with a completion state of SUCCESS at 17:04:13. (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 5:04:13 PM ANR4013I Audit container process 198 completed audit of 1 containers; 1 successfully audited containers, 0 failed audited containers. (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 5:04:13 PM ANR0987I Process 198 for AUDIT CONTAINER running in the BACKGROUND processed 1 items with a completion state of SUCCESS at 17:04:13. (SESSION: 531830, PROCESS: 199)

tsm: BUCKY1>q container /bucky1dc011/5a/0000000000005a8a.dcf f=d

                 Container: /bucky1dc011/5a/0000000000005a8a.dcf
         Storage Pool Name: DCPOOL
            Container Type: Dedup
                     State: Unavailable
            Free Space(MB): 1,879
          Maximum Size(MB): 10,104
Approx. Date Last Written: 09/26/2018 16:58:49
  Approx. Date Last Audit: 09/26/2018 17:04:13
                Cloud Type:
                 Cloud URL:
    Cloud Object Size (MB):
       Space Utilized (MB):
         Data Extent Count:

The audit doesn't mark anything as bad, but as soon as something touches the container it goes back to unavailable; in this case an automatic container move process hit it. A manual move is not successful either:

Sep 26, 2018, 5:23:10 PM ANR0984I Process 215 for Move Container started in the BACKGROUND at 17:23:09. (SESSION: 531830, PROCESS: 215)
Sep 26, 2018, 5:23:10 PM ANR2088E An I/O error occurred while reading container /bucky1dc011/5a/0000000000005a8a.dcf in storage pool DCPOOL. (SESSION: 531830, PROCESS: 215)
Sep 26, 2018, 5:23:10 PM ANR0985I Process 215 for Move Container running in the BACKGROUND completed with completion state FAILURE at 17:23:10. (SESSION: 531830, PROCESS: 215)
Sep 26, 2018, 5:23:10 PM ANR1893E Process 215 for Move Container completed with a completion state of FAILURE. (SESSION: 531830, PROCESS: 215)

After the failed move, the container is left in read-only state:

tsm: BUCKY1>q container /bucky1dc011/5a/0000000000005a8a.dcf f=d

                 Container: /bucky1dc011/5a/0000000000005a8a.dcf
         Storage Pool Name: DCPOOL
            Container Type: Dedup
                     State: Read-Only
            Free Space(MB): 1,881
          Maximum Size(MB): 10,104
Approx. Date Last Written: 09/26/2018 16:58:49
  Approx. Date Last Audit: 09/26/2018 17:04:13
                Cloud Type:
                 Cloud URL:
    Cloud Object Size (MB):
       Space Utilized (MB):
         Data Extent Count:

If I don't do the move container and just leave the container as unavailable, then PROTECT STGPOOL reports warnings.
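For completeness, here are the commands behind the audit, move, and protect steps mentioned above; roughly what I've been running (treat the exact options as approximate, this is from memory):

    audit container /bucky1dc011/5a/0000000000005a8a.dcf action=scanall wait=yes
    move container /bucky1dc011/5a/0000000000005a8a.dcf
    protect stgpool dcpool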
Maybe someone else has encountered and fixed this problem. If so, I'd love to know what you did.

Thanks!

-Kevin

-----Original Message-----
From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> On Behalf Of Alex Jaimes
Sent: Wednesday, September 26, 2018 13:26
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] CONTAINER pool experiences

I echo Stefan, Rick and Luc... 110%

We've been using directory container pools for about 2 years and they work great! And yes, plan accordingly and monitor the TSM DB size as you migrate backups to the container pools.

--Alex

On Wed, Sep 26, 2018 at 7:31 AM Michaud, Luc [Analyste principal - environnement AIX] <luc.micha...@stm.info> wrote:

> Container pools saved the day here too!
>
> On our legacy environment (TSM717), adding dedup to our seqpools just
> bloated everything, until it became unbearable.
>
> Migrating nodes to the new blueprint replicated servers with
> directory container pools solved a lot of our issues, especially with
> copy-to-tape, since rehydration is no longer required.
>
> We do have certain apprehensions about limitations in eventually
> migrating from copy-to-tape to copy-to-cloud, but we may cheat our way
> across with VTL-type gateways if need be.
>
> Luc