2TB archlog? I have never had more than 400GB on any of my systems and have never filled up any of them, until now. You must have a huge amount of backups.
Per your suggestion, we are running nmon for a 24-hour period to see what it comes up with.

I am finding that running the DBBackup locally (from the internal 15K disk to the ISILON/NFS mount) is taking considerably longer than what I was doing before, which was sending it upstream via 10G to one of my other TSM servers, 2 miles away. The last DBBackup to NFS took 9.5 hours for 1.5TB, while the last upstream backup ran 7 hours. It doesn't make any sense at all.

I will ask for SSD, but getting 2TB of SSD for a backup replication server is highly unlikely. There has to be a less expensive way to boost performance. Obviously getting more CPU threads is important.

Thank you for all your help/knowledge. It is greatly appreciated!

On Wed, Jul 26, 2017 at 3:40 PM, Stefan Folkerts <stefan.folke...@gmail.com> wrote:

> Yes, a 300GB archivelog is tiny; that won't work for anything but the
> smallest of environments. I believe a medium sized server has a 2TB
> archive log.
> Database backups take a lot of extra time when reorgs and/or (for example)
> dereference processes are running on 15K database disks; the system simply
> doesn't have the time on the drives to create a speedy database backup
> anymore.
> Database backups achieve a more consistent and lower duration when the
> database is on SSDs, because there is so much performance potential that
> doing multiple things no longer bothers the system as much.
>
> It would surprise me a lot if reducing the memory in the server would fix
> the problems. I've never seen anything like that with Spectrum Protect, but
> I guess there is a first time for everything. :-)
>
> On Wed, Jul 26, 2017 at 4:04 PM, Zoltan Forray <zfor...@vcu.edu> wrote:
>
> > Another point of interest is the archlog filesystem. We originally had it
> > at 300GB, but it kept constantly overflowing & crashing since the DB backups
> > that trigger at 80% wouldn't finish (>5 hours) before it reached 100%. So
> > we recently increased it to 1TB.
> > Now, the last DBBackup has been running
> > for >24 hours and I have been sitting here watching the archlog filesystem
> > %used go from 80% to now 38%. It is taking a long, long time to empty it,
> > even with nothing running but the DBBackup. With nothing but the DBBackup
> > (and archlog flushing) running, the load average is still >25.
> >
> > I really think the additional memory is killing this box. It was never
> > this slow or overloaded before!
> >
> > On Wed, Jul 26, 2017 at 8:26 AM, Stefan Folkerts <stefan.folke...@gmail.com> wrote:
> >
> > > Oh, I just now read the 16 threads correctly, I was thinking you wrote 16
> > > cores!
> > > 8 cores is far below specification if you're running M-size blueprint
> > > ingest figures.
> > > I've seen 16-core Intel servers (2016-spec Xeon CPUs) go up to 70%
> > > utilization, so that kind of load would never work on 8 cores, but again, I
> > > don't know how much managed data you have and what your ingest figures are.
> > >
> > > On Wed, Jul 26, 2017 at 2:02 PM, Zoltan Forray <zfor...@vcu.edu> wrote:
> > >
> > > > I kinda feel the same way since my networking folks say it isn't the 10G
> > > > links (Xymon shows peaks of 2Gb), even though at its peak processing load
> > > > it would be handling 5 TSM servers sending replications across the same 10G
> > > > links also used for the NFS.
> > > >
> > > > If the current processes ever finish (the delete of 9M objects is now into
> > > > 48 hours), I will let the server sit for a day or two to see if it
> > > > improves. I have noticed that even with the server idle (no processes or
> > > > sessions), the CPU load average was still higher than the 16 threads
> > > > available. I am seriously thinking about going back to the original 96GB
> > > > of RAM since it seems a lot of this slowdown started after bumping to
> > > > 192GB.
> > > >
> > > > On Wed, Jul 26, 2017 at 3:16 AM, Stefan Folkerts <stefan.folke...@gmail.com> wrote:
> > > >
> > > > > Interesting, why would NFS be the problem if the deletion of objects
> > > > > doesn't really touch the storage pools?
> > > > >
> > > > > I would wager that a straight-up dd on the system to create a large file
> > > > > via 10Gb/s on NFS would be blazing fast, but the database backup is slow
> > > > > because it's almost never idle; it's always behind on its internal
> > > > > processes such as reorgs.
> > > > >
> > > > > Place your bets! :-)
> > > > >
> > > > > http://www.strawpoll.me/13536369
> > > > >
> > > > > On Mon, Jul 24, 2017 at 3:55 PM, Sasa Drnjevic <sasa.drnje...@srce.hr> wrote:
> > > > >
> > > > > > Not sure of course... But I would blame NFS.
> > > > > >
> > > > > > Did you check the negotiated speed of your NFS eth 10G ifaces?
> > > > > > And that network?
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > --
> > > > > > Sasa Drnjevic
> > > > > > www.srce.unizg.hr
> > > > > >
> > > > > > On 24.7.2017. 15:49, Zoltan Forray wrote:
> > > > > > > 8-cores/16-threads. It wasn't bad when it was replicating from 4 SP/TSM
> > > > > > > servers. We had to stop all replication due to running out of space, and
> > > > > > > until I finish this cleanup, I have been holding off replication. So the
> > > > > > > deletion has been running standalone.
> > > > > > >
> > > > > > > I forgot to mention that DB backups are also running very long. A 1.5TB DB
> > > > > > > backup runs 8+ hours to NFS storage. These are connected via 10G.
> > > > > > >
> > > > > > > On Mon, Jul 24, 2017 at 9:41 AM, Sasa Drnjevic <sasa.drnje...@srce.hr> wrote:
> > > > > > >
> > > > > > >> On 24.7.2017.
15:25, Zoltan Forray wrote:
> > > > > > >>> Due to lack of resources, we have had to stop replication on one of our SP
> > > > > > >>> servers. The replication target server is 7.1.6.3, RHEL 7, Dell T710 with
> > > > > > >>> 192GB RAM. NFS/ISILON storage.
> > > > > > >>>
> > > > > > >>> After removing replication from the nodes on the source server, I have been
> > > > > > >>> cleaning up the replication server by deleting the filespaces for the nodes
> > > > > > >>> we are no longer replicating.
> > > > > > >>>
> > > > > > >>> My issue is that the delete filespaces on the replication server is taking
> > > > > > >>> forever. It took over a week to delete one filespace with 31 million
> > > > > > >>> objects.
> > > > > > >>
> > > > > > >> That is definitely tooooo loooong :-(
> > > > > > >>
> > > > > > >> It would take 6-8 hrs max in my environment, even under "standard" load...
> > > > > > >>
> > > > > > >> How many CPU cores does it have?
> > > > > > >>
> > > > > > >> And how is/was it performing in the role of a target repl. server,
> > > > > > >> performance-wise?
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >>
> > > > > > >> --
> > > > > > >> Sasa Drnjevic
> > > > > > >> www.srce.unizg.hr
> > > > > > >>
> > > > > > >>>
> > > > > > >>> To me it is highly unusual to take this long. Your thoughts on this?
> > > > > > >>>
> > > > > > >>> --
> > > > > > >>> *Zoltan Forray*
> > > > > > >>> Spectrum Protect (p.k.a. 
TSM) Software & Hardware Administrator
> > > > > > >>> Xymon Monitor Administrator
> > > > > > >>> VMware Administrator
> > > > > > >>> Virginia Commonwealth University
> > > > > > >>> UCC/Office of Technology Services
> > > > > > >>> www.ucc.vcu.edu
> > > > > > >>> zfor...@vcu.edu - 804-828-4807
> > > > > > >>> Don't be a phishing victim - VCU and other reputable organizations will
> > > > > > >>> never use email to request that you reply with your password, social
> > > > > > >>> security number or confidential personal information. For more details
> > > > > > >>> visit http://infosecurity.vcu.edu/phishing.html
> > > >
> > > > --
> > > > *Zoltan Forray*
> > > > Spectrum Protect (p.k.a. 
TSM) Software & Hardware Administrator
> > > > Xymon Monitor Administrator
> > > > VMware Administrator
> > > > Virginia Commonwealth University
> > > > UCC/Office of Technology Services
> > > > www.ucc.vcu.edu
> > > > zfor...@vcu.edu - 804-828-4807
> > > > Don't be a phishing victim - VCU and other reputable organizations will
> > > > never use email to request that you reply with your password, social
> > > > security number or confidential personal information. For more details
> > > > visit http://infosecurity.vcu.edu/phishing.html

--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zfor...@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://infosecurity.vcu.edu/phishing.html
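
P.S. As a rough sanity check on the two DBBackup timings at the top of this message (1.5TB in 9.5 hours to NFS, 7 hours upstream), here is a small sketch that turns them into average throughput and compares it with the raw 10G line rate. It assumes decimal units (1 TB = 1,000,000 MB), that the upstream backup was also about 1.5TB, and it ignores protocol overhead. The function and constant names are made up for illustration only.

```python
def throughput_mb_per_s(size_tb: float, hours: float) -> float:
    """Average throughput in MB/s, assuming decimal units (1 TB = 1,000,000 MB)."""
    return size_tb * 1_000_000 / (hours * 3600)


# Raw 10 Gb/s line rate expressed in MB/s (8 bits per byte, no protocol overhead).
LINK_MB_PER_S = 10_000 / 8

# Durations are the ones quoted in this thread; the 1.5TB size for the
# upstream run is an assumption (same database, different target).
for label, hours in (("NFS/ISILON", 9.5), ("upstream TSM server", 7.0)):
    rate = throughput_mb_per_s(1.5, hours)
    share = 100 * rate / LINK_MB_PER_S
    print(f"{label}: {rate:.1f} MB/s, about {share:.1f}% of 10G line rate")
```

Under those assumptions both runs come out well under 5% of line rate, which would fit with the networking folks' view that the 10G links themselves are not the bottleneck.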