Hey Zoltan

Key points for backing up Isilon:
1. Each Isilon node is limited by its CPU/protocol handling rather than by 
networking (other than the new Gen6 F800s).
2. To increase throughput to/from Isilon, increase the number of Isilon nodes 
you access via your clients.
3. To access more Isilon nodes, you can either mount the storage multiple 
times from the same client using different IPs, or use TSM proxies.
4. Increase RESOURCEUTILIZATION to 10 (the maximum) to increase 
parallelisation.
5. Increase MAXNUMMP to be bigger than the number of client machines x the 
resource utilisation x the number of SP clients you run per client machine. 
This ensures each session is actively working and not waiting for a mount 
point.
6. Size your disk storage pool volumes so that you have at least 2 x the max 
number of mount points. That way, should you fill your disk storage pool, you 
do not get lock contention between migration and backup. Ideally you should 
have enough disk pool storage to hold a single run.
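The sizing rules in points 4-6 reduce to simple arithmetic. A minimal sketch 
(the input numbers are illustrative examples, not a recommendation):

```python
# Sketch of the session/mount-point sizing rules above.
# Input values are illustrative; plug in your own environment's numbers.

def required_maxnummp(client_machines: int, sp_clients_per_machine: int,
                      resourceutilization: int) -> int:
    """Point 5: MAXNUMMP should exceed client machines x SP clients per
    machine x RESOURCEUTILIZATION, so no session waits for a mount point."""
    return client_machines * sp_clients_per_machine * resourceutilization

def min_disk_pool_volumes(maxnummp: int) -> int:
    """Point 6: keep at least 2 x MAXNUMMP disk storage pool volumes so
    migration and backup do not contend for the same volumes when the
    pool fills."""
    return 2 * maxnummp

mp = required_maxnummp(client_machines=6, sp_clients_per_machine=4,
                       resourceutilization=10)
print(mp)                         # 240 -> set MAXNUMMP above this
print(min_disk_pool_volumes(mp))  # 480 -> minimum disk pool volume count
```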

We have a setup where we need to do archives of up to 50 TB a day, and we do 
this using over 24 dsmc processes running across 6 client VMs with a resource 
utilisation of 10.
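As a rough sanity check of what that workload implies per stream (back-of-envelope 
only, assuming the load spreads evenly across the dsmc processes):

```python
# Back-of-envelope: per-stream throughput needed for 50 TB/day over 24 dsmc's.
daily_bytes = 50 * 10**12        # 50 TB/day (decimal terabytes)
seconds_per_day = 24 * 3600
streams = 24                     # dsmc processes (e.g. 6 VMs x 4 each)

aggregate_mb_s = daily_bytes / seconds_per_day / 10**6
per_stream_mb_s = aggregate_mb_s / streams

print(round(aggregate_mb_s))     # 579 MB/s aggregate
print(round(per_stream_mb_s))    # 24 MB/s per stream
```

A sustained ~24 MB/s per stream is well within what a single CIFS/NFS mount can 
deliver, which is why spreading streams across many Isilon node IPs matters more 
than raw link speed.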

HTH

Grant

________________________________________
From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> on behalf of Zoltan Forray 
<zfor...@vcu.edu>
Sent: Thursday, 12 July 2018 5:59 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] Looking for suggestions to deal with large backups not 
completing in 24-hours

Robert,

Thanks for the insight/suggestions.  Your scenario is similar to ours but
on a larger scale when it comes to the amount of data/files to process,
thus the issue (assuming such since you didn't list numbers).  Currently we
have 91 ISILON nodes totaling 140M objects and 230TB of data. The largest
(our troublemaker) has over 21M objects and 26TB of data (this is the one
that takes 4-5 days).  dsminstr.log from a recently finished run shows it
only backed up 15K objects.

We agree that this and other similarly large nodes need to be broken up so 
there are fewer objects to back up per node.  But the owner of this large one 
is balking, since previously it was backed up via a solitary Windows server 
using journaling, so everything finished in a day.

We have never dealt with proxy nodes but might need to head in that
direction since our current method of allowing users to perform their own
restores relies on the now deprecated Web Client.  Our current method is
numerous Windows VM servers with 20-30 nodes defined to each.

How do you handle restore requests?

On Wed, Jul 11, 2018 at 2:56 PM Robert Talda <r...@cornell.edu> wrote:

> Zoltan, et al:
>   :IF: I understand the scenario you outline originally, here at Cornell
> we are using two different approaches in backing up large storage arrays.
>
> 1. For backups of CIFS shares in our Shared File Share service hosted on a
> NetApp device,  we rely on a set of Powershell scripts to build a list of
> shares to backup, then invoke up to 5 SP clients at a time, each client
> backing up a share.  As such, we are able to backup some 200+ shares on a
> daily basis.  I’m not sure this is a good match to your problem...
>
> 2. For backups of a large Dell array containing research data that does
> seem to be a good match, I have defined a set of 10 proxy nodes and 240
> hourly schedules (once each hour for each proxy node) that allows us to
> divide the Dell array up into 240 pieces - pieces that are controlled by
> the specification of the “objects” in the schedule.  That is, in your case,
> instead of associating node <ISILON-SOM-STORAGE> to the schedule
> ISILON-SOM-SOMDFS1 with object " \\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\*”,
> I would instead have something like
> Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR1 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRA\*”
> Node PROXY2.ISILON associated to PROXY1.ISILON.HOUR1 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRA\SUBDIRB\*”
> …
> Node PROXY1.ISILON associated to PROXY1.ISILON.HOUR2 for object " \\
> rams.adp.vcu.edu\SOM\TSM\SOMADFS1\DIRB\SUBDIRA\*”
>
> And so on.   For known large directories, slots of multiple hours are
> allocated, up to the largest directory which is given its own proxy node
> with one schedule, and hence 24 hours to back up.
>
> There are pros and cons to both of these, but they do enable us to perform
> the backups.
>
> FWIW,
> Bob
>
> Robert Talda
> EZ-Backup Systems Engineer
> Cornell University
> +1 607-255-8280
> r...@cornell.edu
>
>
> > On Jul 11, 2018, at 7:49 AM, Zoltan Forray <zfor...@vcu.edu> wrote:
> >
> > I will need to translate to English but I gather it is talking about the
> > RESOURCEUTILIZATION / MAXNUMMP values.  While we have increased MAXNUMMP
> > to 5 on the server (will try going higher), not sure how much good it
> > would do since the backup schedule uses OBJECTS to point to a
> > specific/single mountpoint/filesystem (see below), but it is worth trying
> > to bump the RESOURCEUTILIZATION value on the client even higher...
> >
> > We have checked the dsminstr.log file and it is spending 92% of the time
> > in PROCESS DIRS (no surprise)
> >
> > 7:46:25 AM   SUN : q schedule * ISILON-SOM-SOMADFS1 f=d
> >            Policy Domain Name: DFS
> >                 Schedule Name: ISILON-SOM-SOMADFS1
> >                   Description: ISILON-SOM-SOMADFS1
> >                        Action: Incremental
> >                     Subaction:
> >                       Options: -subdir=yes
> >                       Objects: \\rams.adp.vcu.edu\SOM\TSM\SOMADFS1\*
> >                      Priority: 5
> >               Start Date/Time: 12/05/2017 08:30:00
> >                      Duration: 1 Hour(s)
> >    Maximum Run Time (Minutes): 0
> >                Schedule Style: Enhanced
> >                        Period:
> >                   Day of Week: Any
> >                         Month: Any
> >                  Day of Month: Any
> >                 Week of Month: Any
> >                    Expiration:
> > Last Update by (administrator): ZFORRAY
> >         Last Update Date/Time: 01/12/2018 10:30:48
> >              Managing profile:
> >
> >
> > On Tue, Jul 10, 2018 at 4:06 AM Jansen, Jonas <jan...@itc.rwth-aachen.de>
> > wrote:
> >
> >> It is possible to do a parallel backup of file system parts.
> >> https://www.gwdg.de/documents/20182/27257/GN_11-2016_www.pdf (German);
> >> have a look at page 10.
> >>
> >> ---
> >> Jonas Jansen
> >>
> >> IT Center
> >> Gruppe: Server & Storage
> >> Abteilung: Systeme & Betrieb
> >> RWTH Aachen University
> >> Seffenter Weg 23
> >> 52074 Aachen
> >> Tel: +49 241 80-28784
> >> Fax: +49 241 80-22134
> >> jan...@itc.rwth-aachen.de
> >> www.itc.rwth-aachen.de
> >>
> >> -----Original Message-----
> >> From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> On Behalf Of Del
> >> Hoobler
> >> Sent: Monday, July 9, 2018 3:29 PM
> >> To: ADSM-L@VM.MARIST.EDU
> >> Subject: Re: [ADSM-L] Looking for suggestions to deal with large backups
> >> not
> >> completing in 24-hours
> >>
> >> They are a 3rd-party partner that offers an integrated Spectrum Protect
> >> solution for large filer backups.
> >>
> >>
> >> Del
> >>
> >> ----------------------------------------------------
> >>
> >> "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 07/09/2018
> >> 09:17:06 AM:
> >>
> >>> From: Zoltan Forray <zfor...@vcu.edu>
> >>> To: ADSM-L@VM.MARIST.EDU
> >>> Date: 07/09/2018 09:17 AM
> >>> Subject: Re: Looking for suggestions to deal with large backups not
> >>> completing in 24-hours
> >>> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> >>>
> >>> Thanks Del.  Very interesting.  Are they a VAR for IBM?
> >>>
> >>> Not sure if it would work in the current configuration we are using to
> >>> back up ISILON. I have passed the info on.
> >>>
> >>> BTW, FWIW, when I copied/pasted the info, Chrome's spell-checker
> >>> red-flagged "The easy way to incrementally backup billons of objects"
> >>> (billions).
> >>> So if you know anybody at the company, please pass it on to them.
> >>>
> >>> On Mon, Jul 9, 2018 at 6:51 AM Del Hoobler <hoob...@us.ibm.com> wrote:
> >>>
> >>>> Another possible idea is to look at General Storage dsmISI MAGS:
> >>>>
> >>>>        http://www.general-storage.com/PRODUCTS/products.html
> >>>>
> >>>>
> >>>> Del
> >>>>
> >>>>
> >>>> "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 07/05/2018
> >>>> 02:52:27 PM:
> >>>>
> >>>>> From: Zoltan Forray <zfor...@vcu.edu>
> >>>>> To: ADSM-L@VM.MARIST.EDU
> >>>>> Date: 07/05/2018 02:53 PM
> >>>>> Subject: Looking for suggestions to deal with large backups not
> >>>>> completing in 24-hours
> >>>>> Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> >>>>>
> >>>>> As I have mentioned in the past, we have gone through large migrations
> >>>>> to DFS-based storage on EMC ISILON hardware.  As you may recall, we
> >>>>> back up these DFS mounts (about 90 at last count) using multiple
> >>>>> Windows servers that run multiple ISP nodes (about 30 each), and they
> >>>>> access each DFS mount/filesystem via
> >>>>> -object=\\rams.adp.vcu.edu\departmentname.
> >>>>>
> >>>>> This has led to lots of performance issues with backups, and some
> >>>>> departments are now complaining that their backups are running into
> >>>>> multiple days in some cases.
> >>>>>
> >>>>> One such case is a department with 2 nodes, each with over 30 million
> >>>>> objects per node.  In the past, their backups were able to finish
> >>>>> quicker since they were accessed via dedicated servers and were able
> >>>>> to use journaling to reduce the scan times.  Unless things have
> >>>>> changed, I believe journaling is not an option due to how the files
> >>>>> are accessed.
> >>>>>
> >>>>> FWIW, average backups are usually <50K files and <200 GB once the
> >>>>> scanning is finished...
> >>>>>
> >>>>> Also, the idea of HSM/SPACEMANAGEMENT has reared its ugly head, since
> >>>>> many of these objects haven't been accessed in years.  But as I
> >>>>> understand it, that won't work either given our current configuration.
> >>>>>
> >>>>> Given the current DFS configuration (previously CIFS), what can we do
> >>>>> to improve backup performance?
> >>>>>
> >>>>> So, any-and-all ideas are up for discussion.  There is even discussion
> >>>>> of replacing ISP/TSM due to these issues/limitations.
> >>>>>
> >>>>> --
> >>>>> *Zoltan Forray*
> >>>>> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> >>>>> Xymon Monitor Administrator
> >>>>> VMware Administrator
> >>>>> Virginia Commonwealth University
> >>>>> UCC/Office of Technology Services
> >>>>> www.ucc.vcu.edu
> >>>>> zfor...@vcu.edu - 804-828-4807
> >>>>> Don't be a phishing victim - VCU and other reputable organizations
> >> will
> >>>>> never use email to request that you reply with your password, social
> >>>>> security number or confidential personal information. For more
> >> details
> >>>>> visit http://phishing.vcu.edu/
> >>>>>
> >>>>
> >>>
> >>>
> >>> --
> >>> *Zoltan Forray*
> >>> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> >>> Xymon Monitor Administrator
> >>> VMware Administrator
> >>> Virginia Commonwealth University
> >>> UCC/Office of Technology Services
> >>> www.ucc.vcu.edu
> >>> zfor...@vcu.edu - 804-828-4807
> >>>
> >>
> >
> >
> > --
> > *Zoltan Forray*
> > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > Xymon Monitor Administrator
> > VMware Administrator
> > Virginia Commonwealth University
> > UCC/Office of Technology Services
> > www.ucc.vcu.edu
> > zfor...@vcu.edu - 804-828-4807
>
>

--
*Zoltan Forray*
Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
Xymon Monitor Administrator
VMware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
www.ucc.vcu.edu
zfor...@vcu.edu - 804-828-4807
--
Grant Street
Senior Systems Engineer

T: +61 2 9383 4800 (main)
D: +61 2 8310 3582 (direct)
E: grant.str...@al.com.au

Building 54 / FSA #19, Fox Studios Australia, 38 Driver Avenue
Moore Park, NSW 2021
AUSTRALIA


