Hello Andrew,

Is /dev/changer created by udev rules? Have you tried /dev/sgX instead? Can
you send us the output of the "lsscsi -l" command and "dmesg | grep
Attached"? Have you checked your drives/autochanger using just mtx/mt
commands to see if they are working? Which is your mtx version?

Best regards,
Ana

On Wed, Jun 17, 2015 at 7:29 PM, Marcin Haba <ganius...@gmail.com> wrote:

> Hello,
>
> Do you have any errors in dmesg (hardware errors, bus reset, SCSI
> errors ... etc.) ?
>
> Best regards,
> Marcin Haba (gani)
>
> 2015-06-17 21:56 GMT+02:00 Andrew Noonan <anoo...@gmail.com>:
> > Hi all,
> >
> >      It's taking a lot longer because of the higher timeouts, but the
> > label is still failing with a termination.  If I understand it
> > correctly, the mtx-changer script is polling with 'mt' looking for the
> > $ready state, defined in the config file as ONLINE (for Linux).  I'm
> > not seeing drive 0 go into that state... I just see:
> >
> > SCSI 2 tape drive:
> > File number=-1, block number=-1, partition=0.
> > Tape block size 0 bytes. Density code 0x0 (default).
> > Soft error count since last status=0
> > General status bits on (50000):
> >  DR_OPEN IM_REP_EN
> >
> > the other device looks like:
> >
> > SCSI 2 tape drive:
> > File number=0, block number=0, partition=0.
> > Tape block size 0 bytes. Density code 0x5a (no translation).
> > Soft error count since last status=0
> > General status bits on (41010000):
> >  BOT ONLINE IM_REP_EN
> >
> > So I see that it's ~possible~ to see the ONLINE state, but it doesn't
> > seem like it ever gets to that state during load.
> >
> > Any thoughts?
> >
> > Thanks,
> > Andrew
> >
> > On Wed, Jun 17, 2015 at 11:44 AM, Andrew Noonan <anoo...@gmail.com>
> wrote:
> >> Hi Ana,
> >>
> >>      Thanks for the reply.  I'm adding those into the drives.  BTW,
> >> 900 is the value.  Having no real experience with these, is it
> >> abnormal for a load to take the 10+ minutes, or is that reasonable?
> >> My next step is to add those settings in, restart the SD, and attempt
> >> to do a "label barcode" again.
> >>
> >> Thanks,
> >> Andrew
> >>
> >> On Tue, Jun 16, 2015 at 9:10 PM, Ana Emília M. Arruda
> >> <emiliaarr...@gmail.com> wrote:
> >>> Hello Andrew,
> >>>
> >>> You can find in the output of a "lsscsi -l" command the timeout for
> your
> >>> drives. Then you can configure 3 timeout directives for each one of
> your two
> >>> drives (LRADrive-1 e LRADrive-2):
> >>>
> >>> Maximum Changer Wait = X
> >>> Maximum Rewind Wait = X
> >>> Maximum Open Wait = X
> >>>
> >>> where X is the timeout value for your dirves.
> >>>
> >>> You can also customize your mtx-changer script for this timeout
> changing the
> >>> bellow 300 seconds value:
> >>>
> >>> wait_for_drive() {
> >>>   i=0
> >>>   while [ $i -le 300 ]; do  # Wait max 300 seconds
> >>>
> >>> Best regards,
> >>> Ana
> >>>
> >>>
> >>> On Tue, Jun 16, 2015 at 5:02 PM, Andrew Noonan <anoo...@gmail.com>
> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I'm almost completely new to tape.  We've been doing disk-based
> >>>> backups for years, but we now have a project where we want to offsite
> >>>> hundreds of TB permanently, and have a Dell TL4000 (a rebranded IBM
> >>>> 3573-TL from the looks of it) with 2 ULT3580 LTO-6 drives.  We're
> >>>> running bacula 5.2.  The server is a Dell 1950 running Centos 5 (sorry
> >>>> for the old OS).
> >>>>
> >>>> The btape tests run on both units without a problem, including the
> >>>> autochanger tests, and manually executing load/unload/list commands
> >>>> with mtx-changer seem to run fine.  The one exception to this is that
> >>>> the mtx-changer load command seems to take about 10 minutes to
> >>>> complete, which seems unreasonably long.  These are brand new tapes
> >>>> and I haven't written anything to them other then whatever btape does
> >>>> with testing.  I put a 5 minute sleep on the load for mtx-changer, but
> >>>> other then that haven't customized the script, as I'm not sure what
> >>>> I'd customize.
> >>>>
> >>>> The "update slots" command from the director works OK, but when I go
> >>>> to do a "label barcode", the resulting "load slot" gets killed by
> >>>> Bacula:
> >>>>
> >>>> 3992 Bad autochanger "load slot 20, drive 1": ERR=Child died from
> >>>> signal 15: Termination.
> >>>> Results=Program killed by Bacula (timeout)
> >>>>
> >>>> I've seen that in some of these posts to the list, this ends up being
> >>>> permissions problems against the devices, but that doesn't seem to be
> >>>> the case as far as I can see:
> >>>>
> >>>> bacula-sd is running as the bacula user/group.  The bacula user is in
> >>>> the "disk" group, and the *st* devices are in the disk group with "rw"
> >>>> permissions:
> >>>>
> >>>> crw-rw---- 1 root disk 9, 128 Jun  4 12:02 /dev/nst0
> >>>> crw-rw---- 1 root disk 9, 224 Jun  4 12:02 /dev/nst0a
> >>>> crw-rw---- 1 root disk 9, 160 Jun  4 12:02 /dev/nst0l
> >>>> crw-rw---- 1 root disk 9, 192 Jun  4 12:02 /dev/nst0m
> >>>> crw-rw---- 1 root disk 9, 129 Jun 10 17:06 /dev/nst1
> >>>> crw-rw---- 1 root disk 9, 225 Jun 10 17:06 /dev/nst1a
> >>>> crw-rw---- 1 root disk 9, 161 Jun 10 17:06 /dev/nst1l
> >>>> crw-rw---- 1 root disk 9, 193 Jun 10 17:06 /dev/nst1m
> >>>> crw-rw---- 1 root disk 9,   0 Jun  4 12:02 /dev/st0
> >>>> crw-rw---- 1 root disk 9,  96 Jun  4 12:02 /dev/st0a
> >>>> crw-rw---- 1 root disk 9,  32 Jun  4 12:02 /dev/st0l
> >>>> crw-rw---- 1 root disk 9,  64 Jun  4 12:02 /dev/st0m
> >>>> crw-rw---- 1 root disk 9,   1 Jun 10 17:06 /dev/st1
> >>>> crw-rw---- 1 root disk 9,  97 Jun 10 17:06 /dev/st1a
> >>>> crw-rw---- 1 root disk 9,  33 Jun 10 17:06 /dev/st1l
> >>>> crw-rw---- 1 root disk 9,  65 Jun 10 17:06 /dev/st1m
> >>>>
> >>>> Here's a block of debug from the SD during a label attempt for one of
> the
> >>>> slots:
> >>>>
> >>>> odin-sd: autochanger.c:434-0 Wiffle through devices looking for slot
> >>>> odin-sd: autochanger.c:313-0 Locking changer LogRepoAutochanger
> >>>> odin-sd: autochanger.c:740-0 omsg=/usr/lib64/bacula/mtx-changer
> >>>> /dev/changer loaded 14 /dev/nst0 0
> >>>> odin-sd: autochanger.c:272-0 Run program=/usr/lib64/bacula/mtx-changer
> >>>> /dev/changer loaded 14 /dev/nst0 0
> >>>> odin-sd: watchdog.c:206-0 Registered watchdog 636b888, interval 300
> >>>> odin-sd: bpipe.c:220-0 Wait for 28962 opt=1
> >>>> odin-sd: bpipe.c:228-0 Got break wpid=28962 status=0 ERR=none
> >>>> odin-sd: bpipe.c:249-0 child status=0
> >>>> odin-sd: watchdog.c:226-0 Unregistered watchdog 636b888
> >>>> odin-sd: bpipe.c:264-0 returning stat=0,0
> >>>> odin-sd: autochanger.c:274-0 run_prog: /usr/lib64/bacula/mtx-changer
> >>>> /dev/changer loaded 14 /dev/nst0 0 stat=0 result=0
> >>>> odin-sd: autochanger.c:327-0 Unlocking changer LogRepoAutochanger
> >>>> odin-sd: autochanger.c:313-0 Locking changer LogRepoAutochanger
> >>>> odin-sd: autochanger.c:740-0 omsg=/usr/lib64/bacula/mtx-changer
> >>>> /dev/changer loaded 14 /dev/nst1 1
> >>>> odin-sd: autochanger.c:272-0 Run program=/usr/lib64/bacula/mtx-changer
> >>>> /dev/changer loaded 14 /dev/nst1 1
> >>>> odin-sd: watchdog.c:206-0 Registered watchdog 636b888, interval 300
> >>>> odin-sd: bpipe.c:220-0 Wait for 28976 opt=1
> >>>> odin-sd: bpipe.c:228-0 Got break wpid=28976 status=0 ERR=none
> >>>> odin-sd: bpipe.c:249-0 child status=0
> >>>> odin-sd: watchdog.c:226-0 Unregistered watchdog 636b888
> >>>> odin-sd: bpipe.c:264-0 returning stat=0,0
> >>>> odin-sd: autochanger.c:274-0 run_prog: /usr/lib64/bacula/mtx-changer
> >>>> /dev/changer loaded 14 /dev/nst1 1 stat=0 result=0
> >>>> odin-sd: autochanger.c:327-0 Unlocking changer LogRepoAutochanger
> >>>> odin-sd: autochanger.c:453-0 Slot=14 not found in another device
> >>>> odin-sd: autochanger.c:313-0 Locking changer LogRepoAutochanger
> >>>> odin-sd: autochanger.c:183-0 Doing changer load slot 14 "LRADrive-2"
> >>>> (/dev/nst1)
> >>>> odin-sd: autochanger.c:740-0 omsg=/usr/lib64/bacula/mtx-changer
> >>>> /dev/changer load 14 /dev/nst1 1
> >>>> odin-sd: dev.c:1746-0 close_dev "LRADrive-2" (/dev/nst1)
> >>>> odin-sd: dev.c:1751-0 device "LRADrive-2" (/dev/nst1) already closed
> vol=
> >>>> odin-sd: autochanger.c:190-0 Run program=/usr/lib64/bacula/mtx-changer
> >>>> /dev/changer load 14 /dev/nst1 1
> >>>> odin-sd: watchdog.c:206-0 Registered watchdog 636b888, interval 300
> >>>> odin-sd: bpipe.c:443-0 Run program fgets killed=1
> >>>> odin-sd: bpipe.c:220-0 Wait for 28990 opt=1
> >>>> odin-sd: bpipe.c:228-0 Got break wpid=28990 status=15 ERR=none
> >>>> odin-sd: bpipe.c:256-0 Child died from signal 15
> >>>> odin-sd: watchdog.c:235-0 Unregistered inactive watchdog 636b888
> >>>> odin-sd: bpipe.c:264-0 returning stat=15,134217743
> >>>> odin-sd: autochanger.c:205-0 load slot 14, drive 1, bad stats=Child
> >>>> died from signal 15: Termination.
> >>>> odin-sd: autochanger.c:212-0 load slot 14 status=134217743
> >>>> odin-sd: autochanger.c:327-0 Unlocking changer LogRepoAutochanger
> >>>> odin-sd: autochanger.c:218-0 After changer, status=134217743
> >>>> odin-sd: dev.c:1735-0 Clear volhdr vol=
> >>>> odin-sd: vol_mgr.c:544-0 vol_unused: no vol on "LRADrive-2"
> (/dev/nst1)
> >>>> odin-sd: lock.c:302-0 return lock. old=BST_WRITING_LABEL from
> dircmd.c:554
> >>>> odin-sd: lock.c:307-0 return lock. new=BST_NOT_BLOCKED
> >>>> odin-sd: dev.c:1746-0 close_dev "LRADrive-2" (/dev/nst1)
> >>>> odin-sd: dev.c:1751-0 device "LRADrive-2" (/dev/nst1) already closed
> vol=
> >>>> odin-sd: acquire.c:731-0 Enter detach_dcr_from_dev
> >>>> odin-sd: dircmd.c:220-0 <dird: label LogRepoAutochanger
> >>>> VolumeName=000030L6 PoolName=LogrepoArchive MediaType=LTO-6 Slot=15
> >>>> drive=1
> >>>> odin-sd: dircmd.c:234-0 Do command: label
> >>>> odin-sd: dircmd.c:627-0 Try changer device LRADrive-1
> >>>> odin-sd: dircmd.c:648-0 Device LogRepoAutochanger drive wrong: want=1
> >>>> got=0 skipping
> >>>> odin-sd: dircmd.c:627-0 Try changer device LRADrive-2
> >>>> odin-sd: dircmd.c:643-0 Found changer device LRADrive-2
> >>>> odin-sd: dircmd.c:656-0 Found device LRADrive-2
> >>>> odin-sd: block.c:144-0 Returning new block=636b800
> >>>> odin-sd: acquire.c:713-0 JobId=0 enter attach_dcr_to_dev
> >>>> odin-sd: dircmd.c:421-0 Can label. Device is not open
> >>>> odin-sd: lock.c:285-0 steal lock. old=BST_NOT_BLOCKED from
> dircmd.c:470
> >>>> odin-sd: lock.c:290-0 steal lock. new=BST_WRITING_LABEL
> >>>> odin-sd: dircmd.c:471-0 Stole device "LRADrive-2" (/dev/nst1) lock,
> >>>> writing label.
> >>>>
> >>>> The config I've got for these is:
> >>>>
> >>>> Device {
> >>>>   Name = LRADrive-1
> >>>>   Alert Command = "sh -c 'smartctl -H -l error %c'"
> >>>>   AlwaysOpen = yes
> >>>>   ArchiveDevice = /dev/nst0
> >>>>   AutoChanger = yes
> >>>>   AutomaticMount = yes
> >>>>   DeviceType = Tape
> >>>>   DriveIndex = 0
> >>>>   LabelMedia = no
> >>>>   MediaType = LTO-6
> >>>>   RandomAccess = no
> >>>>   RemovableMedia = yes
> >>>> }
> >>>>
> >>>> Device {
> >>>>   Name = LRADrive-2
> >>>>   Alert Command = "sh -c 'smartctl -H -l error %c'"
> >>>>   AlwaysOpen = yes
> >>>>   ArchiveDevice = /dev/nst1
> >>>>   AutoChanger = yes
> >>>>   AutomaticMount = yes
> >>>>   DeviceType = Tape
> >>>>   DriveIndex = 1
> >>>>   LabelMedia = no
> >>>>   MediaType = LTO-6
> >>>>   RandomAccess = no
> >>>>   RemovableMedia = yes
> >>>> }
> >>>>
> >>>> Autochanger {
> >>>>   Name = LogRepoAutochanger
> >>>>   ChangerCommand = "/usr/lib64/bacula/mtx-changer %c %o %S %a %d"
> >>>>   ChangerDevice = /dev/changer
> >>>>   Device = LRADrive-1
> >>>>   Device = LRADrive-2
> >>>> }
> >>>>
> >>>>
> >>>> I know there are some things that could be optimized here for
> >>>> performance, and I'm certainly interested in them, but right now I
> >>>> can't even label my tapes :)
> >>>>
> >>>> I suspect it's the long load delay, and I wasn't sure if maybe the
> >>>> drive is searching for some mark or something.  On that note, I tried
> >>>> to do a "rewind" and "weof" using the /dev/st0 device (wasn't sure if
> >>>> nst0 would complain about issuing a rewind), but I would get
> >>>> "Input/Output error" messages from mt on both the rewind and weof
> >>>> commands.
> >>>>
> >>>> Any advice I could get would be helpful.
> >>>>
> >>>> Thanks!
> >>>> Andrew
> >>>>
> >>>>
> >>>>
> ------------------------------------------------------------------------------
> >>>> _______________________________________________
> >>>> Bacula-users mailing list
> >>>> Bacula-users@lists.sourceforge.net
> >>>> https://lists.sourceforge.net/lists/listinfo/bacula-users
> >>>
> >>>
> >
> >
> ------------------------------------------------------------------------------
> > _______________________________________________
> > Bacula-users mailing list
> > Bacula-users@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/bacula-users
>
>
>
> --
> "Większej miłości nikt nie ma nad tę, jak gdy kto życie swoje kładzie
> za przyjaciół swoich." Jezus Chrystus
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Bacula-users mailing list
> Bacula-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
------------------------------------------------------------------------------
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Reply via email to