
[Bug 98171] [Regression] Marvell SE91xx SATA 3 controllers not recognized correctly

2015-06-29 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=98171

--- Comment #1 from nuc...@hotmail.com ---
> This is a regression against 3.14? There was only one change affecting
> mvsas between those kernels and that was the libsas change:
> 
> commit bc6e7c4b0d1a1f742d96556f63d68f17f4e232c3
> Author: Dan Williams 
> Date: Fri Mar 14 13:52:48 2014 -0700
> 
> libata, libsas: kill pm_result and related cleanup
> 
> Does reverting that make the problem go away?
> 

It's a regression starting with linux 3.15, meaning everything still worked
with linux 3.14.
I have tried reverting that patch from linux-3.15.10 vanilla sources but it
fails... I don't know why, maybe I'm just not leet enough:

Hunk #1 succeeded at 5364 (offset 13 lines).
Hunk #2 succeeded at 5388 (offset 13 lines).
Hunk #3 FAILED at 5390.
Hunk #4 succeeded at 5506 (offset 13 lines).
Hunk #5 succeeded at 5532 (offset 13 lines).
1 out of 5 hunks FAILED -- saving rejects to file drivers/ata/libata-core.c.rej
patching file drivers/ata/libata-eh.c
patching file drivers/scsi/libsas/sas_ata.c
patching file include/linux/libata.h
Hunk #1 succeeded at 850 (offset 2 lines).
Hunk #2 FAILED at 1140.
1 out of 2 hunks FAILED -- saving rejects to file include/linux/libata.h.rej
patching file include/scsi/libsas.h
patch unexpectedly ends in middle of line
Hunk #1 succeeded at 172 with fuzz 1.


What shall I do?

Regards,

Philip
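
The patch output quoted above can be summarised mechanically to see which files need manual attention. The following is only a minimal sketch: the function name `failed_hunks` and the `SAMPLE` string are illustrative, and the sample is an excerpt of the output quoted in this comment. It relies on the "saving rejects to file ....rej" lines that patch prints, because the first "patching file" line is missing from the quoted output.

```python
import re

# Lines of interest in `patch` output, as seen in the quoted message.
HUNK_FAILED = re.compile(r"Hunk #(\d+) FAILED at (\d+)\.")
REJECTS = re.compile(r"saving rejects to file (\S+)\.rej")

# Excerpt of the output quoted in this comment.
SAMPLE = """\
Hunk #2 succeeded at 5388 (offset 13 lines).
Hunk #3 FAILED at 5390.
Hunk #4 succeeded at 5506 (offset 13 lines).
1 out of 5 hunks FAILED -- saving rejects to file drivers/ata/libata-core.c.rej
patching file include/linux/libata.h
Hunk #1 succeeded at 850 (offset 2 lines).
Hunk #2 FAILED at 1140.
1 out of 2 hunks FAILED -- saving rejects to file include/linux/libata.h.rej
"""


def failed_hunks(patch_output):
    """Map each file with rejects to a list of (hunk number, line) pairs."""
    result = {}
    pending = []  # failed hunks seen since the last "saving rejects" line
    for line in patch_output.splitlines():
        match = HUNK_FAILED.search(line)
        if match:
            pending.append((int(match.group(1)), int(match.group(2))))
            continue
        match = REJECTS.search(line)
        if match:
            # The rejects line names the file the pending hunks belong to.
            result[match.group(1)] = pending
            pending = []
    return result


if __name__ == "__main__":
    for path, hunks in failed_hunks(SAMPLE).items():
        print(path, hunks)
```

Each file listed this way has a matching .rej file holding the hunks that did not apply; those hunks have to be compared against the 3.15.10 source by hand.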

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Bug 98171] [Regression] Marvell SE91xx SATA 3 controllers not recognized correctly

2015-06-30 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=98171

Tom Yan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tom.t...@gmail.com

--- Comment #2 from Tom Yan  ---
Why don't you get linux 3.14 (linux-lts in the current Arch), patch it with
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=bc6e7c4b0d1a1f742d96556f63d68f17f4e232c3
and see if it reproduces the issue?


[Bug 98171] [Regression] Marvell SE91xx SATA 3 controllers not recognized correctly

2015-06-30 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=98171

--- Comment #3 from Tom Yan  ---
By the way, why would it be related to mvsas/libsas? Aren't those controllers
SATA ones that use the ahci driver?


[Bug 71021] WARNING: CPU: 0 PID: 5517 at /build/buildd/linux-3.13.0/fs/sysfs/group.c:214 sysfs_remove_group+0xc6/0xd0()

2015-07-03 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=71021

Dāvis  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |davis...@gmail.com

--- Comment #2 from Dāvis  ---
I see a similar call trace on 4.0.7. I ran "echo 1 >
/sys/block/sde/device/delete", removed that disk, and then removed another disk
without running that command first. Then I saw this in the journal:


kernel: [ cut here ]
kernel: WARNING: CPU: 3 PID: 7857 at fs/sysfs/group.c:219
sysfs_remove_group+0xa1/0xb0()
kernel: sysfs group 81893860 not found for kobject 'end_device-7:3'
kernel: Modules linked in: nls_iso8859_4 nls_iso8859_1 nls_cp437 xt_tcpudp
ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack
ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
ip6table_raw ip6table_security ip6table_filter ip6_tables iptable_mangle
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_raw iptable_security iptable_filter nls_utf8 nls_cp775 vfat fat
saa7134_alsa tuner_simple tuner_types tda8290 tuner nvidia(PO) saa7134
videobuf2_dma_sg tveeprom rc_core videobuf2_memops videobuf2_core edac_core
edac_mce_amd joydev mousedev mxm_wmi snd_usb_audio snd_usbmidi_lib snd_rawmidi
v4l2_common videodev snd_seq_device kvm_amd evdev kvm serio_raw
kernel:  media crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd r8169 mac_hid pcspkr
fam15h_power tpm_infineon mii k10temp tpm_tis wmi snd_hda_codec_hdmi drm tpm
snd_hda_codec_realtek sp5100_tco snd_hda_codec_generic acpi_cpufreq
snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer snd
soundcore button shpchp processor i2c_piix4 i2c_core sch_fq_codel ip_tables
x_tables btrfs xor raid6_pq hid_generic usbhid hid sd_mod atkbd libps2
crc32c_intel firewire_ohci firewire_core crc_itu_t ohci_pci ahci ehci_pci
ohci_hcd libahci ehci_hcd mvsas libsas scsi_transport_sas xhci_pci libata
xhci_hcd usbcore scsi_mod usb_common i8042 serio
kernel: CPU: 3 PID: 7857 Comm: kworker/u16:4 Tainted: P   O   
4.0.7-2-ARCH #1
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel: Workqueue: scsi_wq_7 sas_destruct_devices [libsas]
kernel:   1fac2762 880167ccbbf8 81574ec3
kernel:   880167ccbc50 880167ccbc38 81074e7a
kernel:  880167ccbc38  81893860 880222e77810
kernel: Call Trace:
kernel:  [] dump_stack+0x4c/0x6e
kernel:  [] warn_slowpath_common+0x8a/0xc0
kernel:  [] warn_slowpath_fmt+0x55/0x70
kernel:  [] ? kernfs_find_and_get_ns+0x4c/0x60
kernel:  [] sysfs_remove_group+0xa1/0xb0
kernel:  [] dpm_sysfs_remove+0x57/0x60
kernel:  [] device_del+0x58/0x270
kernel:  [] device_unregister+0x22/0x80
kernel:  [] bsg_unregister_queue+0x60/0xc0
kernel:  [] sas_rphy_remove+0x4c/0x80 [scsi_transport_sas]
kernel:  [] sas_rphy_delete+0x16/0x30 [scsi_transport_sas]
kernel:  [] sas_destruct_devices+0x65/0x90 [libsas]
kernel:  [] process_one_work+0x14b/0x470
kernel:  [] worker_thread+0x48/0x4b0
kernel:  [] ? init_pwq.part.7+0x10/0x10
kernel:  [] ? init_pwq.part.7+0x10/0x10
kernel:  [] kthread+0xd8/0xf0
kernel:  [] ? schedule+0x37/0x90
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel:  [] ret_from_fork+0x58/0x90
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel: ---[ end trace 82d1791291d041c8 ]---
kernel: [ cut here ]
kernel: WARNING: CPU: 3 PID: 7857 at fs/sysfs/group.c:219
sysfs_remove_group+0xa1/0xb0()
kernel: sysfs group 81893860 not found for kobject 'end_device-7:3'
kernel: Modules linked in: nls_iso8859_4 nls_iso8859_1 nls_cp437 xt_tcpudp
ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT ...
kernel:  media crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel
aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ...
kernel: CPU: 3 PID: 7857 Comm: kworker/u16:4 Tainted: PW  O   
4.0.7-2-ARCH #1
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel: Workqueue: scsi_wq_7 sas_destruct_devices [libsas]
kernel:   1fac2762 880167ccbb88 81574ec3
kernel:   880167ccbbe0 880167ccbbc8 81074e7a
kernel:  880167ccbbc8  81893860 880222ec6838
kernel: Call Trace:
kernel:  [] dump_stack+0x4c/0x6e
kernel:  [] warn_slowpath_common+0x8a/0xc0
kernel:  [] warn_slowpath_fmt+0x55/0x70
kernel:  [] ? kernfs_find_and_get_ns+0x4c/0x60
kernel:  [] sysfs_remove_group+0xa1/0xb0
kernel:  [] dpm_sysfs_remove+0x57/0x60
kernel:  [] device_del+0x58/0x270
kernel:  [] ? device_remove_file+0x19/0x20
kernel:  [] attribute_container_class_device_del+0x1e/0x30
kernel:  [] transport_r

[Bug 100921] New: Kernel cannot read partition table automatically.But use partprobe command can.

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=100921

            Bug ID: 100921
           Summary: Kernel cannot read partition table automatically.But
                    use partprobe command can.
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 4.1.0
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
          Assignee: linux-scsi@vger.kernel.org
          Reporter: ha...@126.com
        Regression: No

I use Linux kernel 4.1.0 on my Gentoo Linux system.
Whenever I plug my USB HDD into a USB port, the kernel does not read the
partition table automatically.
But when I run the partprobe command, the partition table is read.
I want to know whether this is a bug.

Here is the output of "dmesg | tail":
dmesg | tail 
[ 5384.806237] usb-storage 1-4:1.0: USB Mass Storage device detected 
[ 5384.807159] scsi host8: usb-storage 1-4:1.0 
[ 5386.440476] scsi 8:0:0:0: Direct-Access hp   v250w1100
PQ: 0 ANSI: 4 
[ 5386.440660] sd 8:0:0:0: Attached scsi generic sg2 type 0 
[ 5386.440899] sd 8:0:0:0: [sdb] 31506432 512-byte logical blocks: (16.1 GB/15.0 GiB) 
[ 5386.441314] sd 8:0:0:0: [sdb] Write Protect is off 
[ 5386.441319] sd 8:0:0:0: [sdb] Mode Sense: 43 00 00 00 
[ 5386.441758] sd 8:0:0:0: [sdb] No Caching mode page found 
[ 5386.441762] sd 8:0:0:0: [sdb] Assuming drive cache: write through 
[ 5386.445789] sd 8:0:0:0: [sdb] Attached SCSI removable disk

ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sda3  /dev/sda4  /dev/sda5  /dev/sda6 
/dev/sda7  /dev/sda8  /dev/sdb
As shown above, there is no /dev/sdbX (X=1,2,3...).
Then I run "sudo partprobe":
ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sda3  /dev/sda4  /dev/sda5  /dev/sda6 
/dev/sda7  /dev/sda8  /dev/sdb  /dev/sdb1
As shown above, /dev/sdb1 now appears.

I want to know whether this is a bug. I don't want to run the "partprobe"
command every time I use my USB devices.

Yours sincerely
Thanks.
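
The "ls /dev/sd*" comparison above can also be made against sysfs, which reflects what the kernel itself currently knows regardless of whether the device nodes have been created. This is a rough sketch, not a diagnosis: it assumes the usual sysfs layout in which each partition of a disk is a subdirectory of /sys/block/<disk>/ containing a "partition" file, and the function name `list_partitions` and its `sysfs_root` parameter are made up for illustration.

```python
import os


def list_partitions(disk, sysfs_root="/sys/block"):
    """Return the partition names the kernel currently knows for `disk`."""
    disk_dir = os.path.join(sysfs_root, disk)
    if not os.path.isdir(disk_dir):
        return []  # disk not present (or sysfs not mounted)
    return sorted(
        entry
        for entry in os.listdir(disk_dir)
        # Only partition subdirectories carry a "partition" attribute file;
        # other entries such as "queue" or "holders" do not.
        if os.path.exists(os.path.join(disk_dir, entry, "partition"))
    )


if __name__ == "__main__":
    # "sdb" is the disk name from the report above.
    print(list_partitions("sdb"))
```

Running it once right after plugging in the disk and once after partprobe would show whether the kernel had scanned the partition table at all before partprobe was run.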


[Bug 100921] Kernel cannot read partition table automatically.But use partprobe command can.

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=100921

Wallance  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|SCSI                        |USB
            Product|IO/Storage                  |Drivers


[Bug 100921] Kernel cannot read partition table automatically.But use partprobe command can.

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=100921

--- Comment #1 from Wallance  ---
My USB devices use the VFAT filesystem.


[Bug 101011] New: Kernel Oops when disconnecting a mounted ext4 usb stick

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

            Bug ID: 101011
           Summary: Kernel Oops when disconnecting a mounted ext4 usb
                    stick
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 4.1.1-040101-generic
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
          Assignee: linux-scsi@vger.kernel.org
          Reporter: konra...@gmail.com
        Regression: No

I am running Linux Mint Mate 17.2 with an updated Ubuntu mainline kernel
downloaded from here:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.1.1-unstable/

When I disconnect a mounted ext4 USB stick without properly unmounting it
first, the kernel crashes and I need to reboot. Here are the relevant logs; I
can reproduce this every time I try:

Jul  4 22:26:52 Lenny kernel: [  807.592356]  sdb: sdb1
Jul  4 22:26:52 Lenny kernel: [  807.595350] sd 3:0:0:0: [sdb] Attached SCSI
removable disk
Jul  4 22:26:53 Lenny kernel: [  807.826241] EXT4-fs (sdb1): mounted filesystem
with ordered data mode. Opts: (null)
Jul  4 22:27:02 Lenny kernel: [  817.461405] usb 2-2: USB disconnect, device
number 2
Jul  4 22:27:02 Lenny kernel: [  817.490648] Buffer I/O error on dev sdb1,
logical block 15237120, lost sync page write
Jul  4 22:27:02 Lenny kernel: [  817.490655] JBD2: Error -5 detected when
updating journal superblock for sdb1-8.
Jul  4 22:27:02 Lenny kernel: [  817.490873] BUG: unable to handle kernel
paging request at 34beb000
Jul  4 22:27:02 Lenny kernel: [  817.490929] IP: []
__percpu_counter_add+0x18/0xc0
Jul  4 22:27:02 Lenny kernel: [  817.490977] *pdpt = 23db9001 *pde =
 
Jul  4 22:27:02 Lenny kernel: [  817.491024] Oops:  [#1] SMP 
Jul  4 22:27:02 Lenny kernel: [  817.491056] Modules linked in: uas usb_storage
ctr ccm msr snd_hda_codec_analog snd_hda_codec_generic dm_multipath scsi_dh
pcmcia coretemp kvm_intel kvm snd_seq_midi snd_seq_midi_event snd_rawmidi arc4
snd_seq yenta_socket iwl3945 serio_raw thinkpad_acpi iwlegacy mac80211
snd_hda_intel nvram snd_hda_controller snd_seq_device snd_hda_codec
snd_hda_core snd_hwdep pcmcia_rsrc lpc_ich btusb pcmcia_core cfg80211 snd_pcm
btbcm btintel rfcomm snd_timer shpchp bnep bluetooth snd soundcore 8250_fintek
parport_pc ppdev tp_smapi(OE) thinkpad_ec(OE) mac_hid lp parport dm_mirror
dm_region_hash dm_log i915 e1000e i2c_algo_bit sdhci_pci psmouse drm_kms_helper
ahci ptp libahci sdhci drm pps_core video
Jul  4 22:27:02 Lenny kernel: [  817.491694] CPU: 0 PID: 4083 Comm: umount
Tainted: G U OE   4.1.1-040101-generic #201507011435
Jul  4 22:27:02 Lenny kernel: [  817.491761] Hardware name: LENOVO
7675CTO/7675CTO, BIOS 7NETC2WW (2.22 ) 03/22/2011
Jul  4 22:27:02 Lenny kernel: [  817.491814] task: ebf06b50 ti: ebebc000
task.ti: ebebc000
Jul  4 22:27:02 Lenny kernel: [  817.491853] EIP: 0060:[] EFLAGS:
00010082 CPU: 0
Jul  4 22:27:02 Lenny kernel: [  817.491894] EIP is at
__percpu_counter_add+0x18/0xc0
Jul  4 22:27:02 Lenny kernel: [  817.491931] EAX: f21c8e88 EBX: f21c8e88 ECX:
 EDX: 0001
Jul  4 22:27:02 Lenny kernel: [  817.491975] ESI: 0001 EDI:  EBP:
ebebde60 ESP: ebebde40
Jul  4 22:27:02 Lenny kernel: [  817.492018]  DS: 007b ES: 007b FS: 00d8 GS:
00e0 SS: 0068
Jul  4 22:27:02 Lenny kernel: [  817.492057] CR0: 8005003b CR2: 34beb000 CR3:
33354200 CR4: 07f0
Jul  4 22:27:02 Lenny kernel: [  817.492100] Stack:
Jul  4 22:27:02 Lenny kernel: [  817.492117]  c1abe100 edcb0098 edcb00ec
 f21c8e68  f21c8e68 f286d160
Jul  4 22:27:02 Lenny kernel: [  817.492198]  ebebde84 c1160454 0010
0282 f72a77f8 0984 f72a77f8 f286d160
Jul  4 22:27:02 Lenny kernel: [  817.492277]  f286d170 ebebdea0 c11e613f
 0282 f72a77f8 edd7f4d0 
Jul  4 22:27:02 Lenny kernel: [  817.492355] Call Trace:
Jul  4 22:27:02 Lenny kernel: [  817.492379]  []
account_page_dirtied+0x74/0x110
Jul  4 22:27:02 Lenny kernel: [  817.492420]  []
__set_page_dirty+0x3f/0xb0
Jul  4 22:27:02 Lenny kernel: [  817.492459]  []
mark_buffer_dirty+0x53/0xc0
Jul  4 22:27:02 Lenny kernel: [  817.492497]  []
ext4_commit_super+0x17b/0x250
Jul  4 22:27:02 Lenny kernel: [  817.492535]  []
ext4_put_super+0xc1/0x320
Jul  4 22:27:02 Lenny kernel: [  817.492572]  [] ?
fsnotify_unmount_inodes+0x1aa/0x1c0
Jul  4 22:27:02 Lenny kernel: [  817.492615]  [] ?
evict_inodes+0xca/0xe0
Jul  4 22:27:02 Lenny kernel: [  817.492653]  []
generic_shutdown_super+0x6a/0xe0
Jul  4 22:27:02 Lenny kernel: [  817.492695]  [] ?
prepare_to_wait_event+0xd0/0xd0
Jul  4 22:27:02 Lenny kernel: [  817.492736]  [] ?
unregister_shrinker+0x40/0x50
Jul  4 22:27:02 Lenny kernel: [  817.492775]  []
kill_block_super+0x26/0x70
Jul  4 22:27:02 Lenny kernel: [  817.492815]  []
deactivate_locked_super+0x45/0x80
Jul  4 22:27:02 Lenny kernel: [  817.492854]  []
deactivate_super+0x47/0x60
Jul  4 22:27:02 Lenny kernel: [  817.49

[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

--- Comment #1 from konra...@gmail.com ---
Created attachment 181941
  --> https://bugzilla.kernel.org/attachment.cgi?id=181941&action=edit
uname -a


[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

--- Comment #3 from konra...@gmail.com ---
Created attachment 181961
  --> https://bugzilla.kernel.org/attachment.cgi?id=181961&action=edit
dmesg


[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

--- Comment #2 from konra...@gmail.com ---
Created attachment 181951
  --> https://bugzilla.kernel.org/attachment.cgi?id=181951&action=edit
cat /proc/version


[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-07-05 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

--- Comment #4 from konra...@gmail.com ---
Created attachment 181971
  --> https://bugzilla.kernel.org/attachment.cgi?id=181971&action=edit
lspci -vvnn


[Bug 101201] New: hpsa hang when creating ext4 FS

2015-07-08 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101201

            Bug ID: 101201
           Summary: hpsa hang when creating ext4 FS
           Product: SCSI Drivers
           Version: 2.5
    Kernel Version: 3.16.7-ckt11-1
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
          Reporter: elac...@easter-eggs.com
        Regression: No

Hardware:

HP ProLiant DL385p Gen8 with 2 SSD 100G and 2 HD SATA 3TB, configured in HBA
mode, no HW RAID set up.

I boot the Debian 8 installer, then:


- load hpsa
- create one partition on each of the SSDs
- create a software RAID 1 with mdadm on those 2 partitions
- the RAID synchronizes successfully
- create a PV, then a VG on this RAID without problems
- run mkswap on an LV without problems
- run mkfs.ext4 on an LV -> the process HANGS
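
For anyone trying to reproduce this, the steps above could be written down roughly as the command sequence below. This is a dry-run sketch that only prints the commands; it does not run them. The device names, the volume group name "vg0", the LV names and the LV sizes are all assumptions that do not come from the report, and the partitioning step and the wait for the RAID sync are left out.

```python
import shlex


def reproduction_commands(ssd_parts, md_dev="/dev/md0", vg="vg0"):
    """Build the command list for the steps above (names are hypothetical)."""
    swap_lv, fs_lv = "swap", "root"  # assumed LV names, not from the report
    return [
        ["modprobe", "hpsa"],
        # Software RAID 1 across the SSD partitions.
        ["mdadm", "--create", md_dev, "--level=1",
         f"--raid-devices={len(ssd_parts)}", *ssd_parts],
        # LVM on top of the RAID device.
        ["pvcreate", md_dev],
        ["vgcreate", vg, md_dev],
        ["lvcreate", "-L", "1G", "-n", swap_lv, vg],   # size is an assumption
        ["lvcreate", "-L", "10G", "-n", fs_lv, vg],    # size is an assumption
        ["mkswap", f"/dev/{vg}/{swap_lv}"],
        # This is the step reported to hang.
        ["mkfs.ext4", f"/dev/{vg}/{fs_lv}"],
    ]


if __name__ == "__main__":
    # Hypothetical partition names for the two SSDs.
    for cmd in reproduction_commands(["/dev/sda1", "/dev/sdb1"]):
        print(shlex.join(cmd))
```

Printing rather than executing is deliberate: every one of these commands is destructive on a real machine. shlex.join needs Python 3.8 or later.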

dmesg gives:

[ 1490.566398] hpsa :03:00.0: Abort request on C6:B2:T0:L0
[ 1490.566819] hpsa :03:00.0: ABORT REQUEST on C6:B2:T0:L0
Tag:0x:0030 Command:0x42 SN:0x1687d9  REQUEST SUCCEEDED.
[ 1540.309302] hpsa :03:00.0: ABORT REQUEST on C6:B2:T0:L0
Tag:0x:0030 Command:0x42 SN:0x1687d9  FAILED. Aborted command has
not completed after 30 seconds.
[ 1540.309319] hpsa :03:00.0: Abort request on C6:B2:T1:L0
[ 1540.345047] hpsa :03:00.0: ABORT REQUEST on C6:B2:T1:L0
Tag:0x:0010 Command:0x42 SN:0x1687d8  REQUEST SUCCEEDED.
[ 1588.796036] hpsa :03:00.0: ABORT REQUEST on C6:B2:T1:L0
Tag:0x:0010 Command:0x42 SN:0x1687d8  FAILED. Aborted command has
not completed after 30 seconds.
[ 1588.796090] hpsa :03:00.0: resetting device 6:2:0:0


The mkfs process doesn't finish and I can no longer access the disks (parted,
etc.).

I can reproduce this on two identical servers.

I looked at the kernel changelogs and did not see anything in recent releases
that might fix this.


[Bug 100921] Kernel cannot read partition table automatically.But use partprobe command can.

2015-07-08 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=100921

--- Comment #2 from Wallance  ---
Oh, I have solved it.


[Bug 101371] New: OOPS: unplugging western digital passport drive

2015-07-12 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101371

            Bug ID: 101371
           Summary: OOPS: unplugging western digital passport drive
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 4.0.7
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: SCSI
          Assignee: linux-scsi@vger.kernel.org
          Reporter: ila...@gmail.com
        Regression: No

Created attachment 182471
  --> https://bugzilla.kernel.org/attachment.cgi?id=182471&action=edit
kernel log

Hi,

I can reproducibly crash my laptop by unplugging a Western Digital Passport
drive.
The flow is as follows:

1. plug in the Passport drive
2. wait until the drivers are loaded
3. plug in or unplug the power cable
4. unplug the Passport

This only happens after changing the power state of the laptop. So if I plug in
the Passport drive while on BAT (or AC) and unplug it while on AC (or BAT),
the system crashes.
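
Since the crash depends on the power state changing between plugging and unplugging, it may help to record that state at each step. The sketch below assumes the common sysfs layout where each entry under /sys/class/power_supply has a "type" file (with the value "Mains" for an AC adapter) and an "online" file; the function name `mains_online` and its `root` parameter are illustrative only.

```python
import os


def mains_online(root="/sys/class/power_supply"):
    """True if a Mains supply is online, False if none is, None if unknown."""
    if not os.path.isdir(root):
        return None
    state = None  # stays None when no Mains supply is found at all
    for name in sorted(os.listdir(root)):
        base = os.path.join(root, name)
        try:
            with open(os.path.join(base, "type")) as handle:
                if handle.read().strip() != "Mains":
                    continue  # skip batteries and other supply types
            with open(os.path.join(base, "online")) as handle:
                online = handle.read().strip() == "1"
        except OSError:
            continue  # attribute missing or unreadable
        if online:
            return True
        state = False
    return state


if __name__ == "__main__":
    print("on AC:", mains_online())
```

Calling it before step 1 and again before step 4 shows whether the laptop moved between BAT and AC in between.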

I think this is the same bug but for an older kernel version:
https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg37689.html

Kernel log is attached.

Friendly regards,

Ilan Cohen


[Bug 101371] OOPS: unplugging western digital passport drive

2015-07-18 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101371

Joe Lawrence  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |joe.lawre...@stratus.com

--- Comment #1 from Joe Lawrence  ---
Hi Ilan,

Have you tried the patch to drivers/scsi/scsi_pm.c that Alan posted in the
thread you mentioned:

https://www.mail-archive.com/linux-scsi@vger.kernel.org/msg37852.html

Thanks,

-- Joe


[Bug 101371] OOPS: unplugging western digital passport drive

2015-07-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101371

--- Comment #2 from Ilan Cohen  ---
Hi Joe,

I applied the patch Alan posted to kernel 4.0.8 and the crash does not occur
anymore.

Regards,

Ilan Cohen


[Bug 101781] New: kernel BUG at block/blk-core.c:1217!

2015-07-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101781

            Bug ID: 101781
           Summary: kernel BUG at block/blk-core.c:1217!
           Product: IO/Storage
           Version: 2.5
    Kernel Version: 3.10.0
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: SCSI
          Assignee: linux-scsi@vger.kernel.org
          Reporter: tomsunc...@gmail.com
        Regression: No

Created attachment 183221
  --> https://bugzilla.kernel.org/attachment.cgi?id=183221&action=edit
the request, request_queue, scsi_cmnd struct info
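
The log that follows starts with a failed Read(10) command and an I/O error at sector 136. As a reading aid, the CDB bytes can be decoded with the small sketch below. It follows the standard READ(10) layout (opcode 0x28, a 4-byte big-endian logical block address in bytes 2-5, a 2-byte transfer length in bytes 7-8); the function name `decode_read10` is made up for illustration.

```python
def decode_read10(cdb_hex):
    """Decode the LBA and transfer length from a READ(10) CDB hex string."""
    cdb = bytes.fromhex(cdb_hex)  # spaces between byte pairs are accepted
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        # Bytes 2..5: logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: number of blocks to transfer, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


if __name__ == "__main__":
    # CDB copied from the "Read(10):" line in the log below.
    print(decode_read10("28 00 00 00 00 88 00 00 78 00"))
    # prints {'lba': 136, 'blocks': 120}
```

The decoded LBA of 136 matches the "sector 136" in the end_request line, so the error refers to the start of that 120-block read.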

[ 1001.043824] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
[ 1001.043827] sd 1:0:0:5: [sdg] CDB: 
[ 1001.043828] Read(10): 28 00 00 00 00 88 00 00 78 00
[ 1001.043834] end_request: I/O error, dev sdg, sector 136
[ 1031.878134] qla2xxx [:08:00.0]-801c:1: Abort command issued nexus=1:0:5
--  1 2002.
[ 1072.919498] qla2xxx [:08:00.0]-801c:1: Abort command issued nexus=1:0:5
--  1 2002.
[ 1103.819949] qla2xxx [:08:00.0]-801c:1: Abort command issued nexus=1:0:3
--  1 2002.
[ 1105.029568] qla2xxx [:08:00.0]-801c:1: Abort command issued nexus=1:0:5
--  1 2002.
[ 1106.032392] qla2xxx [:08:00.0]-801c:1: Abort command issued nexus=1:0:5
--  1 2002.
[ 1137.070991] qla2xxx [:08:00.0]-8009:1: DEVICE RESET ISSUED nexus=1:0:3
cmd=880424b68e00.
[ 1137.073202] qla2xxx [:08:00.0]-800e:1: DEVICE RESET SUCCEEDED
nexus:1:0:3 cmd=880424b68e00.
[ 1137.074163] sd 1:0:0:5: [sdg]  
[ 1137.074197] Sense Key : No Sense [current] 
[ 1137.074203] sd 1:0:0:5: [sdg]  
[ 1137.074206] Add. Sense: No additional sense information
[ 1153.562495] [ cut here ]
[ 1153.562607] kernel BUG at block/blk-core.c:1217!
[ 1153.562678] invalid opcode:  [#1] SMP 
[ 1153.562746] Modules linked in: gfs2 dlm sctp sg xt_CHECKSUM iptable_mangle
ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
xt_conntrack nf_conntrack ipt_REJECT tun bridge stp llc ebtable_filter ebtables
ip6table_filter ip6_tables iptable_filter ip_tables iscsi_tcp libiscsi_tcp
libiscsi scsi_transport_iscsi openvswitch vxlan ip_tunnel gre iTCO_wdt
iTCO_vendor_support coretemp crct10dif_pclmul crc32_pclmul dm_service_time
crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper
ablk_helper ipmi_devintf cryptd serio_raw pcspkr hpilo hpwdt i7core_edac
lpc_ich ipmi_si mfd_core edac_core shpchp ipmi_msghandler acpi_power_meter
pcc_cpufreq mperf register_ipmc_reboot(OF) ifb kvm_intel kvm binfmt_misc
dm_multipath xfs libcrc32c sr_mod cdrom sd_mod ata_generic pata_acpi
[ 1153.564866]  crc_t10dif crct10dif_common radeon i2c_algo_bit drm_kms_helper
qla2xxx ttm tg3 ata_piix drm scsi_transport_fc ptp libata i2c_core hpsa
scsi_tgt pps_core dm_mirror dm_region_hash dm_log dm_mod
[ 1153.565195] CPU: 33 PID: 0 Comm: swapper/33 Tainted: GF 
O--   3.10.0-123.el7.x86_64 #1
[ 1153.565331] Hardware name: HP ProLiant DL580 G7, BIOS P65 07/01/2013
[ 1153.565420] task: 880427ceb8e0 ti: 880427cf8000 task.ti:
880427cf8000
[ 1153.565621] RIP: 0010:[]  []
blk_requeue_request+0x97/0xa0
[ 1153.566136] RSP: 0018:88143f6c3e08  EFLAGS: 00010082
[ 1153.566365] RAX: fff2 RBX: 881425f73000 RCX:
dead00200200
[ 1153.566548] RDX:  RSI: 881427745380 RDI:
0002
[ 1153.566652] RBP: 88143f6c3e20 R08: 8814277454d0 R09:

[ 1153.566751] R10:  R11: 0001 R12:
881427745380
[ 1153.571678] R13: 880424b79680 R14: 8800be4e9180 R15:
880427f3e828
[ 1153.576748] FS:  () GS:88143f6c()
knlGS:
[ 1153.582295] CS:  0010 DS:  ES:  CR0: 8005003b
[ 1153.587101] CR2: 7fcadd7e7ee8 CR3: 018e CR4:
07e0
[ 1153.592485] DR0:  DR1:  DR2:

[ 1153.597998] DR3:  DR6: 0ff0 DR7:
0400
[ 1153.602863] Stack:
[ 1153.608184]  881425f73000 880424b79680 0202
88143f6c3e68
[ 1153.613920]  813e7a48 0097 0286
8800be4e9180
[ 1153.619616]  2001 0002bf20 0006
0001
[ 1153.625120] Call Trace:
[ 1153.630042]   
[ 1153.630171] 
[ 1153.635622]  [] __scsi_queue_insert+0x98/0x120
[ 1153.641325]  [] scsi_softirq_done+0xd2/0x160
[ 1153.646999]  [] blk_done_softirq+0x90/0xc0
[ 1153.652708]  [] __do_softirq+0xf7/0x290
[ 1153.658445]  [] call_softirq+0x1c/0x30
[ 1153.664254]  [] do_softirq+0x55/0xa0
[ 1153.670070]  [] irq_exit+0x25d/0x270
[ 1153.675837]  []
smp_call_function_single_interrupt+0x35/0x40
[ 1153.681772]  [] call_function_single_interrupt+0x6d/0x80
[ 1153.687744]   
[ 1153.687773] 
[ 1153.693735]  [] ? lapic_next_event+0x1d/0x30
[ 1153.699792]  [] ? finish_task_switch+0x53/0x170
[ 1153.705869]  [] __

[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-07-22 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

taz-...@latribu.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |taz-...@latribu.com

--- Comment #5 from taz-...@latribu.com ---
I can also reproduce this on a non-tainted kernel (4.1.2) on an old laptop:

jui 21 10:19:15 Aspire kernel: usb 3-3.3.4.1.1: USB disconnect, device number
15
jui 21 10:19:15 Aspire kernel: BUG: unable to handle kernel paging request at
34943000
jui 21 10:19:16 Aspire kernel: IP: [] __percpu_counter_add+0x1b/0xd0
jui 21 10:19:16 Aspire kernel: *pde =  
jui 21 10:19:17 Aspire kernel: Oops:  [#1] PREEMPT SMP 
jui 21 10:19:17 Aspire kernel: Modules linked in: joydev psmouse
snd_hda_codec_hdmi pcspkr serio_raw iTCO_wdt iTCO_vendor_support i2c_i801 evdev
mousedev mac_hid i915 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel
ipw2200 snd_hda_controller snd_hda_codec drm_kms_helper 8139too libipw
snd_hda_core drm 8139cp lib80211 snd_hwdep cfg80211 snd_pcm pcmcia i2c_algo_bit
i2c_core mii rfkill snd_timer lpc_ich intel_agp yenta_socket intel_gtt agpgart
pcmcia_rsrc pcmcia_core snd rng_core soundcore thermal battery shpchp video ac
button acpi_cpufreq processor sch_fq_codel ip_tables x_tables ext4 crc16
mbcache jbd2 hid_generic usbhid hid uas usb_storage sr_mod cdrom sd_mod
ata_generic pata_acpi atkbd libps2 ata_piix libata scsi_mod ehci_pci uhci_hcd
ehci_hcd usbcore usb_common i8042 serio
jui 21 10:19:17 Aspire kernel: CPU: 0 PID: 616 Comm: umount Not tainted
4.1.2-2-ARCH #1
jui 21 10:19:17 Aspire kernel: Hardware name: Acer, inc. Aspire 1640Z   
/Lugano3 , BIOS 3A24 10/30/06
jui 21 10:19:17 Aspire kernel: task: f4183fc0 ti: f1a7c000 task.ti: f1a7c000
jui 21 10:19:17 Aspire kernel: EIP: 0060:[] EFLAGS: 00010082 CPU: 0
jui 21 10:19:17 Aspire kernel: EIP is at __percpu_counter_add+0x1b/0xd0
jui 21 10:19:17 Aspire kernel: EAX: c1cee508 EBX: c1cee508 ECX:  EDX:
0001
jui 21 10:19:17 Aspire kernel: ESI:  EDI:  EBP: f1a7de70 ESP:
f1a7de50
jui 21 10:19:17 Aspire kernel:  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
jui 21 10:19:17 Aspire kernel: CR0: 8005003b CR2: 34943000 CR3: 31aa CR4:
07d0
jui 21 10:19:17 Aspire kernel: Stack:
jui 21 10:19:17 Aspire kernel:  0285 0285 c166f540 0001 
c1cee4e8  f53a635c
jui 21 10:19:17 Aspire kernel:  f1a7de8c c1138904 0010 f6879660 f6879660
f53a635c f53a636c f1a7dea8
jui 21 10:19:18 Aspire kernel:  c11bd549  0282 f6879660 f521bc78
f2f5c400 f1a7deb8 c11bd6a3
jui 21 10:19:18 Aspire kernel: Call Trace:
jui 21 10:19:18 Aspire kernel:  [] account_page_dirtied+0x74/0x120
jui 21 10:19:18 Aspire kernel:  [] __set_page_dirty+0x39/0xb0
jui 21 10:19:19 Aspire kernel:  [] mark_buffer_dirty+0x53/0xd0
jui 21 10:19:19 Aspire kernel:  [] ext4_commit_super+0x158/0x230
[ext4]
jui 21 10:19:19 Aspire kernel:  [] ? mb_cache_shrink+0x55/0x250
[mbcache]
jui 21 10:19:19 Aspire kernel:  [] ext4_put_super+0xc7/0x320 [ext4]
jui 21 10:19:19 Aspire kernel:  [] ? dispose_list+0x32/0x40
jui 21 10:19:20 Aspire kernel:  [] ? evict_inodes+0xf2/0x110
jui 21 10:19:20 Aspire kernel:  [] generic_shutdown_super+0x64/0xe0
jui 21 10:19:20 Aspire kernel:  [] ? unregister_shrinker+0x40/0x50
jui 21 10:19:20 Aspire kernel:  [] kill_block_super+0x1f/0x70
jui 21 10:19:20 Aspire kernel:  [] deactivate_locked_super+0x3d/0x70
jui 21 10:19:21 Aspire kernel:  [] deactivate_super+0x57/0x60
jui 21 10:19:21 Aspire kernel:  [] cleanup_mnt+0x39/0x90
jui 21 10:19:21 Aspire kernel:  [] __cleanup_mnt+0x10/0x20
jui 21 10:19:21 Aspire kernel:  [] task_work_run+0xc9/0xe0
jui 21 10:19:21 Aspire kernel:  [] do_notify_resume+0x75/0x80
jui 21 10:19:22 Aspire kernel:  [] work_notifysig+0x30/0x37
jui 21 10:19:22 Aspire kernel: Code: 39 c7 77 de eb da 8d 76 00 8d bc 27 00 00
00 00 55 89 e5 57 56 53 89 c3 83 ec 14 89 55 ec 89 4d f0 64 ff 05 44 17 71 c1
8b 7b 14 <64> 8b 37 89 7d e0 89 f7 c1 ff 1f 01 d6 8b 55 08 11 cf 89 d1 c1
jui 21 10:19:22 Aspire kernel: EIP: [] __percpu_counter_add+0x1b/0xd0
SS:ESP 0068:f1a7de50
jui 21 10:19:22 Aspire kernel: CR2: 34943000
jui 21 10:19:22 Aspire kernel: ---[ end trace 4bafb307c38dbc3e ]---


[Bug 101841] New: iommu memory handling cause mmblock devices stop working

2015-07-23 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101841

        Bug ID: 101841
       Summary: iommu memory handling cause mmblock devices stop
                working
       Product: IO/Storage
       Version: 2.5
Kernel Version: 4.0.x, 4.1.x
      Hardware: All
            OS: Linux
          Tree: Mainline
        Status: NEW
      Severity: normal
      Priority: P1
     Component: SCSI
      Assignee: linux-scsi@vger.kernel.org
      Reporter: dron...@gmail.com
    Regression: No

Copying from a Transcend 16 GB SDHC I card on a laptop slows down after
100-300 MB.

dmesg is repeatedly spammed with:

[  336.101876] DMA: Out of SW-IOMMU space for 65536 bytes at device
:16:00.0
[  336.106637] DMA: Out of SW-IOMMU space for 65536 bytes at device
:16:00.0
[  336.106637] [ cut here ]
[  336.106637] WARNING: CPU: 2 PID: 0 at drivers/mmc/host/sdhci.c:856
sdhci_send_command+0x8e4/0xc30 [sdhci]()
[  336.106637] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat mmc_block
fuse xt_multiport xt_CHECKSUM iptable_mangle ipt_MASQUERADE
nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp
tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables
iptable_filter uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core
v4l2_common videodev media joydev arc4 brcmsmac nvidia(PO) cordic brcmutil
mac80211 cfg80211 snd_hda_codec_hdmi rfkill r8169 mousedev ir_lirc_codec drm
snd_hda_codec_realtek snd_hda_codec_generic lirc_dev snd_hda_intel
snd_hda_controller evdev mac_hid snd_hda_codec snd_hda_core mii bcma
ir_rc6_decoder ir_sanyo_decoder ir_nec_decoder jmb38x_ms i2c_core
ir_sharp_decoder ir_mce_kbd_decoder
[  336.106637]  ir_jvc_decoder ir_rc5_decoder ir_xmp_decoder ir_sony_decoder
memstick rc_rc6_mce ene_ir rc_core coretemp crc32c_intel snd_hwdep psmouse
snd_pcm snd_timer snd serio_raw toshiba_haps toshiba_bluetooth mei_me soundcore
mei shpchp battery ac video button sch_fq_codel msr sparse_keymap loop
kvm_intel kvm cpuid cpufreq_stats cpufreq_userspace cpufreq_conservative
cpufreq_powersave acpi_cpufreq processor ip_tables x_tables ext4 crc16 mbcache
jbd2 sr_mod cdrom sd_mod atkbd libps2 ahci libahci libata scsi_mod ehci_pci
ehci_hcd sdhci_pci sdhci led_class usbcore mmc_core usb_common i8042 serio lz4
lz4_compress
[  336.106637] CPU: 2 PID: 0 Comm: swapper/2 Tainted: PW  O   
4.1.2-2-ARCH #1
[  336.106637] Hardware name: TOSHIBA Satellite A660/NWQAA, BIOS 1.60 07/23/10
[  336.106637]   1c0fa02e044d6e83 88023bc83d18
81585c8e
[  336.106637]    88023bc83d58
81078c9a
[  336.106637]  880231cff4c0 880231cff4c0 88020fcf8290
88020fcf8310
[  336.106637] Call Trace:
[  336.106637][] dump_stack+0x4c/0x6e
[  336.106637]  [] warn_slowpath_common+0x8a/0xc0
[  336.106637]  [] warn_slowpath_null+0x1a/0x20
[  336.106637]  [] sdhci_send_command+0x8e4/0xc30 [sdhci]
[  336.106637]  [] ? ktime_get+0x37/0xb0
[  336.106637]  [] sdhci_finish_command+0x15c/0x170 [sdhci]
[  336.106637]  [] sdhci_irq+0x329/0x990 [sdhci]
[  336.106637]  [] ? enqueue_hrtimer+0x29/0xa0
[  336.106637]  [] handle_irq_event_percpu+0x3e/0x1f0
[  336.106637]  [] handle_irq_event+0x41/0x70
[  336.106637]  [] handle_fasteoi_irq+0x82/0x130
[  336.106637]  [] handle_irq+0x22/0x40
[  336.106637]  [] do_IRQ+0x4f/0xf0
[  336.106637]  [] common_interrupt+0x6e/0x6e
[  336.106637][] ? cpuidle_enter_state+0x8f/0x240
[  336.106637]  [] ? cpuidle_enter_state+0x64/0x240
[  336.106637]  [] cpuidle_enter+0x17/0x20
[  336.106637]  [] cpu_startup_entry+0x31c/0x450
[  336.106637]  [] start_secondary+0x196/0x1e0
[  336.106637] ---[ end trace 9cf1bf48caa4616d ]---

Setting swiotlb=131072 makes it work a bit longer: about 1 GB is copied
without such logs, but then it returns to the error state.

GRUB_CMDLINE_LINUX="resume=/dev/sda2 nomodeset notsc clocksource=acpi_pm
nmi_watchdog=0 numa=off fastboot usbhid.mousepoll=2
init=/usr/lib/systemd/systemd enable_mtrr_cleanup intel_pstate=disable
zswap.enabled=1 zswap.max_pool_percent=27 zswap.compressor=lz4
drm.vblankoffdelay=1 audit=0 pcie_aspm=powersave pcie_ports=native
drm_kms_helper.poll=0 intel_iommu=on swiotlb=131072"
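
For scale, the swiotlb= value is a number of I/O TLB slabs, not a byte count. A minimal sketch of the arithmetic, assuming the 2 KiB slab size (IO_TLB_SHIFT of 11) used by mainline kernels of this era; the helper names are made up for illustration:

```c
#include <stdint.h>

/* Assumption: one swiotlb slab is 1 << 11 = 2048 bytes (IO_TLB_SHIFT). */
#define SLAB_SHIFT 11

/* Hypothetical helper: bounce-buffer pool size in bytes for swiotlb=<nslabs>. */
static uint64_t swiotlb_bytes(uint64_t nslabs)
{
    return nslabs << SLAB_SHIFT;
}

/* Hypothetical helper: how many mappings of a given size fit in the pool.
 * Returns 0 for a zero-sized mapping instead of dividing by zero. */
static uint64_t mappings_that_fit(uint64_t nslabs, uint64_t mapping_bytes)
{
    if (mapping_bytes == 0)
        return 0;
    return swiotlb_bytes(nslabs) / mapping_bytes;
}
```

Under that assumption, swiotlb_bytes(131072) is 268435456 bytes (256 MiB), against 64 MiB for the usual default of 32768 slabs, and the pool holds 4096 of the 65536-byte mappings named in the log. A larger pool would only delay exhaustion if mappings are not being returned to it, which would be consistent with "works a bit longer", though the report does not establish the cause.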


[Bug 101841] iommu memory handling cause mmblock devices stop working

2015-07-23 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101841

--- Comment #1 from Ivan  ---
intel_iommu=off does not help, but after switching to 3.14.48 LTS I
successfully transferred all the data without any errors.


[Bug 101891] New: mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-07-23 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

        Bug ID: 101891
       Summary: mvsas prep failed, NULL pointer dereference in
                mvs_slot_task_free+0x5/0x1f0 [mvsas]
       Product: SCSI Drivers
       Version: 2.5
Kernel Version: 4.1.2
      Hardware: x86-64
            OS: Linux
          Tree: Mainline
        Status: NEW
      Severity: normal
      Priority: P1
     Component: Other
      Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
      Reporter: davis...@gmail.com
    Regression: No

Got this call trace; it caused any attempt to access those disks to hang
(I couldn't even kill those processes, e.g. ls).
I am using a HighPoint RocketRAID 2760A controller.

kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
kernel: sas: trying to find task 0x880213ac6a00
kernel: sas: sas_scsi_find_task: aborting task 0x880213ac6a00
kernel: BUG: unable to handle kernel NULL pointer dereference at
0010
kernel: IP: [] mvs_slot_task_free+0x5/0x1f0 [mvsas]
kernel: PGD 1ee973067 PUD 1ee974067 PMD 0
kernel: Oops:  [#1] PREEMPT SMP
kernel: Modules linked in: fuse nf_conntrack_netbios_ns nf_conntrack_broadcast
xt_tcpudp ip6t_rpfilter ip
kernel:  aesni_intel rc_core snd_hda_codec_realtek aes_x86_64 lrw gf128mul
videobuf2_dma_sg glue_helper a
kernel: CPU: 3 PID: 227 Comm: scsi_eh_7 Tainted: P   O4.1.2-2-ARCH
#1
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel: task: 88007f849e90 ti: 880223184000 task.ti: 880223184000
kernel: RIP: 0010:[]  []
mvs_slot_task_free+0x5/0x1f0 [mvsas]
kernel: RSP: 0018:880223187d00  EFLAGS: 00010a13
kernel: RAX: 2e8ba2e8ba2e8ba3 RBX: 880213ac6a00 RCX: a2e8bb8b9cb3907b
kernel: RDX:  RSI: 880213ac6a00 RDI: 88022244
kernel: RBP: 880223187d58 R08: 000a R09: 0607
kernel: R10: 000213fc R11: 0607 R12: 0005
kernel: R13: 880222a59000 R14: 88022244 R15: 880213ac6a08
kernel: FS:  7fdddc839880() GS:88022ecc()
knlGS:
kernel: CS:  0010 DS:  ES:  CR0: 8005003b
kernel: CR2: 0010 CR3: 0001ee978000 CR4: 000407e0
kernel: Stack:
kernel:  a0210bde 88020018 880223187d68 880223187d28
kernel:  a5257e12 88007f840208 0005 880223187db0
kernel:  880213ac6a08 8802230ef000 880213ac6a00 880223187e28
kernel: Call Trace:
kernel:  [] ? mvs_abort_task+0x1ce/0x230 [mvsas]
kernel:  [] sas_scsi_recover_host+0x47b/0xc20 [libsas]
kernel:  [] scsi_error_handler+0xfc/0x580 [scsi_mod]
kernel:  [] ? __schedule+0x362/0xa30
kernel:  [] ? scsi_eh_get_sense+0x190/0x190 [scsi_mod]
kernel:  [] kthread+0xd8/0xf0
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel:  [] ret_from_fork+0x42/0x70
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel: Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 8b 87 b0 00 00 00 89 f6 48
89 e5 f0 48 0f b3 30 5d c
kernel: RIP  [] mvs_slot_task_free+0x5/0x1f0 [mvsas]
kernel:  RSP 
kernel: CR2: 0010
kernel: ---[ end trace 18b7a6f928680374 ]---


[Bug 101891] mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-07-23 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

--- Comment #1 from Dāvis  ---
Some more call traces


[ cut here ]
kernel: WARNING: CPU: 4 PID: 6442 at fs/sysfs/group.c:224
sysfs_remove_group+0xa1/0xb0()
kernel: sysfs group 8189de80 not found for kobject 'end_device-8:0'
kernel: Modules linked in: fuse nf_conntrack_netbios_ns nf_conntrack_broadcast
xt_tcpudp ip6t_rpfilter
kernel:  aesni_intel rc_core snd_hda_codec_realtek aes_x86_64 lrw gf128mul
videobuf2_dma_sg glue_helper
kernel: CPU: 4 PID: 6442 Comm: kworker/u16:12 Tainted: P  R   DO   
4.1.2-2-ARCH #1
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel: Workqueue: scsi_wq_8 sas_destruct_devices [libsas]
kernel:   fff093ac 88008071bbf8 81585c8e
kernel:   88008071bc50 88008071bc38 81078c9a
kernel:  88008071bc68  8189de80 880222550810
kernel: Call Trace:
kernel:  [] dump_stack+0x4c/0x6e
kernel:  [] warn_slowpath_common+0x8a/0xc0
kernel:  [] warn_slowpath_fmt+0x55/0x70
kernel:  [] ? kernfs_find_and_get_ns+0x4c/0x60
kernel:  [] sysfs_remove_group+0xa1/0xb0
kernel:  [] dpm_sysfs_remove+0x57/0x60
kernel:  [] device_del+0x58/0x270
kernel:  [] ? put_device+0x17/0x20
kernel:  [] device_unregister+0x22/0x80
kernel:  [] bsg_unregister_queue+0x60/0xc0
kernel:  [] sas_rphy_remove+0x4c/0x80 [scsi_transport_sas]
kernel:  [] sas_rphy_delete+0x16/0x30 [scsi_transport_sas]
kernel:  [] sas_destruct_devices+0x65/0x90 [libsas]
kernel:  [] process_one_work+0x14b/0x470
kernel:  [] worker_thread+0x48/0x4c0
kernel:  [] ? process_one_work+0x470/0x470
kernel:  [] kthread+0xd8/0xf0
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel:  [] ret_from_fork+0x42/0x70
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel: ---[ end trace 18b7a6f928680375 ]---
kernel: [ cut here ]
kernel: WARNING: CPU: 4 PID: 6442 at fs/sysfs/group.c:224
sysfs_remove_group+0xa1/0xb0()
kernel: sysfs group 8189de80 not found for kobject 'end_device-8:1'
kernel: Modules linked in: fuse nf_conntrack_netbios_ns nf_conntrack_broadcast
xt_tcpudp ip6t_rpfilter ip
kernel:  aesni_intel rc_core snd_hda_codec_realtek aes_x86_64 lrw gf128mul
videobuf2_dma_sg glue_helper a
kernel: CPU: 4 PID: 6442 Comm: kworker/u16:12 Tainted: P  R   D W  O   
4.1.2-2-ARCH #1
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel: Workqueue: scsi_wq_8 sas_destruct_devices [libsas]
kernel:   fff093ac 88008071bc38 81585c8e
kernel:   88008071bc90 88008071bc78 81078c9a
kernel:  88008071bc78  8189de80 88022254c810
kernel: Call Trace:
kernel:  [] dump_stack+0x4c/0x6e
kernel:  [] warn_slowpath_common+0x8a/0xc0
kernel:  [] warn_slowpath_fmt+0x55/0x70
kernel:  [] ? kernfs_find_and_get_ns+0x4c/0x60
kernel:  [] sysfs_remove_group+0xa1/0xb0
kernel:  [] dpm_sysfs_remove+0x57/0x60
kernel:  [] device_del+0x58/0x270
kernel:  [] sas_rphy_remove+0x5c/0x80 [scsi_transport_sas]
kernel:  [] sas_rphy_delete+0x16/0x30 [scsi_transport_sas]
kernel:  [] sas_destruct_devices+0x65/0x90 [libsas]
kernel:  [] process_one_work+0x14b/0x470
kernel:  [] worker_thread+0x48/0x4c0
kernel:  [] ? process_one_work+0x470/0x470
kernel:  [] kthread+0xd8/0xf0
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel:  [] ret_from_fork+0x42/0x70
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel: ---[ end trace 18b7a6f92868037c ]---
kernel: [ cut here ]
kernel: WARNING: CPU: 4 PID: 6442 at fs/sysfs/group.c:224
sysfs_remove_group+0xa1/0xb0()
kernel: sysfs group 8189de80 not found for kobject 'end_device-8:2'
kernel: Modules linked in: fuse nf_conntrack_netbios_ns nf_conntrack_broadcast
xt_tcpudp ip6t_rpfilter ip
kernel:  aesni_intel rc_core snd_hda_codec_realtek aes_x86_64 lrw gf128mul
videobuf2_dma_sg glue_helper a
kernel: CPU: 4 PID: 6442 Comm: kworker/u16:12 Tainted: P  R   D W  O   
4.1.2-2-ARCH #1
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel: Workqueue: scsi_wq_8 sas_destruct_devices [libsas]
kernel:   fff093ac 88008071bb88 81585c8e
kernel:   88008071bbe0 88008071bbc8 81078c9a
kernel:  88008071bbc8  8189de80 88022254d838
kernel: Call Trace:
kernel:  [] dump_stack+0x4c/0x6e
kernel:  [] warn_slowpath_common+0x8a/0xc0
kernel:  [] warn_slowpath_fmt+0x55/0x70
kernel:  [] ? kernfs_find_and_get_ns+0x4c/0x60
kernel:  [] sysfs_remove_group+0xa1/0xb0
kernel:  [] dpm_sysfs_remove+0x57/0x60
kernel:  [] device_del+0x58/0x270
kernel:  [] ? device_remove_file+0x19/0x20
kernel:  [] attribute_container_class_device_del+0x1e/0x30
kernel:  [] transport_remove_classdev+0x52/0x60
kernel:  [] ? transport_add_class_device+0x40/0x40
kernel:

[Bug 101891] mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-07-24 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

--- Comment #2 from Dāvis  ---
(In reply to Dāvis from comment #0)
> Got this call trace, it caused any attempts to access those disks hang
> (couldn't even kill those processes, eg. ls).
> Using HighPoint RocketRAID 2760A controller.
> 
> kernel: mvsas :07:00.0: mvsas prep failed[0]!
> kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
> kernel: sas: trying to find task 0x880213ac6a00
> kernel: sas: sas_scsi_find_task: aborting task 0x880213ac6a00
> kernel: BUG: unable to handle kernel NULL pointer dereference at
> 0010
> kernel: IP: [] mvs_slot_task_free+0x5/0x1f0 [mvsas]
> kernel: PGD 1ee973067 PUD 1ee974067 PMD 0
> kernel: Oops:  [#1] PREEMPT SMP
> kernel: Modules linked in: fuse nf_conntrack_netbios_ns
> nf_conntrack_broadcast xt_tcpudp ip6t_rpfilter ip
> kernel:  aesni_intel rc_core snd_hda_codec_realtek aes_x86_64 lrw gf128mul
> videobuf2_dma_sg glue_helper a
> kernel: CPU: 3 PID: 227 Comm: scsi_eh_7 Tainted: P   O   
> 4.1.2-2-ARCH #1
> kernel: Hardware name: Gigabyte Technology Co., Ltd.
> GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
> kernel: task: 88007f849e90 ti: 880223184000 task.ti: 880223184000
> kernel: RIP: 0010:[]  []
> mvs_slot_task_free+0x5/0x1f0 [mvsas]
> kernel: RSP: 0018:880223187d00  EFLAGS: 00010a13
> kernel: RAX: 2e8ba2e8ba2e8ba3 RBX: 880213ac6a00 RCX: a2e8bb8b9cb3907b
> kernel: RDX:  RSI: 880213ac6a00 RDI: 88022244
> kernel: RBP: 880223187d58 R08: 000a R09: 0607
> kernel: R10: 000213fc R11: 0607 R12: 0005
> kernel: R13: 880222a59000 R14: 88022244 R15: 880213ac6a08
> kernel: FS:  7fdddc839880() GS:88022ecc()
> knlGS:
> kernel: CS:  0010 DS:  ES:  CR0: 8005003b
> kernel: CR2: 0010 CR3: 0001ee978000 CR4: 000407e0
> kernel: Stack:
> kernel:  a0210bde 88020018 880223187d68 880223187d28
> kernel:  a5257e12 88007f840208 0005 880223187db0
> kernel:  880213ac6a08 8802230ef000 880213ac6a00 880223187e28
> kernel: Call Trace:
> kernel:  [] ? mvs_abort_task+0x1ce/0x230 [mvsas]
> kernel:  [] sas_scsi_recover_host+0x47b/0xc20 [libsas]
> kernel:  [] scsi_error_handler+0xfc/0x580 [scsi_mod]
> kernel:  [] ? __schedule+0x362/0xa30
> kernel:  [] ? scsi_eh_get_sense+0x190/0x190 [scsi_mod]
> kernel:  [] kthread+0xd8/0xf0
> kernel:  [] ? kthread_worker_fn+0x170/0x170
> kernel:  [] ret_from_fork+0x42/0x70
> kernel:  [] ? kthread_worker_fn+0x170/0x170
> kernel: Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 8b 87 b0 00 00 00 89 f6
> 48 89 e5 f0 48 0f b3 30 5d c
> kernel: RIP  [] mvs_slot_task_free+0x5/0x1f0 [mvsas]
> kernel:  RSP 
> kernel: CR2: 0010
> kernel: ---[ end trace 18b7a6f928680374 ]---

It didn't use to happen before, but today I got it again. It seems quite
reproducible, as my usage was pretty similar: basically heavy I/O, rsync and
compiling. There also seems to be no way to get the disks back other than
rebooting, as removing the kernel modules fails (even with force).


[Bug 98171] [Regression] Marvell SE91xx SATA 3 controllers not recognized correctly

2015-08-08 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=98171

Carlos Guidugli  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guidu...@gmail.com

--- Comment #4 from Carlos Guidugli  ---
Hi,

I am also having the same issue with Fedora 22, kernel 4.1.3-200.fc22.x86_64.
Any recommendations?

[0.828869] ata9: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610100
irq 35
[0.828873] ata10: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610180
irq 35
[0.828877] ata11: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610200
irq 35
[0.828880] ata12: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610280
irq 35
[0.828884] ata13: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610300
irq 35
[0.828887] ata14: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610380
irq 35
[0.828890] ata15: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610400
irq 35
[0.828894] ata16: SATA max UDMA/133 abar m2048@0xdf61 port 0xdf610480
irq 35
[1.116331] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[1.117021] ata1.00: ATA-9: WDC WD60EFRX-68MYMN1, 82.00A82, max UDMA/133
[1.117026] ata1.00: 11721045168 sectors, multi 16: LBA48 NCQ (depth 31/32),
AA
[1.118349] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[1.119026] ata6.00: ATA-9: WDC WD60EFRX-68MYMN1, 82.00A82, max UDMA/133
[1.119031] ata6.00: 11721045168 sectors, multi 16: LBA48 NCQ (depth 31/32),
AA
[1.119341] ata8: SATA link down (SStatus 0 SControl 300)
[1.119346] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[1.119377] ata7: SATA link down (SStatus 0 SControl 300)
[1.119394] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[1.119413] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[1.119740] ata6.00: configured for UDMA/133
[1.120073] ata5.00: ATA-9: WDC WD60EFRX-68MYMN1, 82.00A82, max UDMA/133
[1.120078] ata5.00: 11721045168 sectors, multi 16: LBA48 NCQ (depth 31/32),
AA
[1.120798] ata5.00: configured for UDMA/133
[1.134350] ata14: SATA link down (SStatus 0 SControl 300)
[1.134376] ata16: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[1.134401] ata12: SATA link down (SStatus 0 SControl 300)
[1.134498] ata16.00: ATAPI: MARVELL VIRTUALL, 1.09, max UDMA/66
[1.134652] ata16.00: configured for UDMA/66
[1.136354] ata15: SATA link down (SStatus 0 SControl 300)
[1.136379] ata10: SATA link down (SStatus 0 SControl 300)
[1.136403] ata11: SATA link down (SStatus 0 SControl 300)
[1.136425] ata9: SATA link down (SStatus 0 SControl 300)
[1.136446] ata13: SATA link down (SStatus 0 SControl 300)
[1.323489] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[1.327731] ata3.00: HPA detected: current 3907027055, native 3907029168
[1.327822] ata2.00: ATA-9: WDC WD60EFRX-68MYMN1, 82.00A82, max UDMA/133
[1.327827] ata2.00: 11721045168 sectors, multi 16: LBA48 NCQ (depth 31/32),
AA
[1.327839] ata3.00: ATA-8: WDC WD20EARS-00S8B1, 80.00A80, max UDMA/133
[1.327844] ata3.00: 3907027055 sectors, multi 16: LBA48 NCQ (depth 31/32),
AA
[1.327966] ata4.00: HPA detected: current 3907027055, native 3907029168
[1.328069] ata4.00: ATA-8: WDC WD20EARS-00S8B1, 80.00A80, max UDMA/133
[1.328073] ata4.00: 3907027055 sectors, multi 16: LBA48 NCQ (depth 31/32),
AA
[1.328281] ata1.00: configured for UDMA/133
[1.328518] scsi 0:0:0:0: Direct-Access ATA  WDC WD60EFRX-68M 0A82
PQ: 0 ANSI: 5
[1.328620] ata2.00: configured for UDMA/133
[1.328965] sd 0:0:0:0: Attached scsi generic sg0 type 0
[1.329004] sd 0:0:0:0: [sda] 11721045168 512-byte logical blocks: (6.00
TB/5.45 TiB)
[1.329009] sd 0:0:0:0: [sda] 4096-byte physical blocks
[1.329214] sd 0:0:0:0: [sda] Write Protect is off
[1.329219] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[1.329250] scsi 1:0:0:0: Direct-Access ATA  WDC WD60EFRX-68M 0A82
PQ: 0 ANSI: 5
[1.329348] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled,
doesn't support DPO or FUA
[1.329652] sd 1:0:0:0: [sdb] 11721045168 512-byte logical blocks: (6.00
TB/5.45 TiB)
[1.329657] sd 1:0:0:0: [sdb] 4096-byte physical blocks
[1.329669] sd 1:0:0:0: Attached scsi generic sg1 type 0
[1.329782] sd 1:0:0:0: [sdb] Write Protect is off
[1.329788] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[1.329880] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled,
doesn't support DPO or FUA
[1.332247] ata3.00: configured for UDMA/133
[1.332429] scsi 2:0:0:0: Direct-Access ATA  WDC WD20EARS-00S 0A80
PQ: 0 ANSI: 5
[1.332811] sd 2:0:0:0: [sdc] 3907027055 512-byte logical blocks: (2.00
TB/1.81 TiB)
[1.332851] sd 2:0:0:0: Attached scsi generic sg2 type 0
[1.332899] sd 2:0:0:0: [sdc] Write Protect is off
[1.332905] sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[1.33297

[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-08-10 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

k.kotle...@sims.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |h...@lst.de,
                   |                            |linux-e...@vger.kernel.org

--- Comment #6 from k.kotle...@sims.pl ---
This bug is still present in 4.2-rc6. Reverting:

commit 08439fec266c3cc5702953b4f54bdf5649357de0
Author: Christoph Hellwig 
Date:   Thu Apr 2 23:56:32 2015 -0400

    ext4: remove block_device_ejected

    bdi->dev now never goes away, so this function became useless.

    Signed-off-by: Christoph Hellwig 
    Signed-off-by: Theodore Ts'o 

makes the oops go away, at least for me (I've tested this on v4.1.4 i386 only).
I hope this rings some bells.


[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-08-14 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

Maciej Szmigiero  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |m...@maciej.szmigiero.name

--- Comment #7 from Maciej Szmigiero  ---
I can also confirm that this bug is present in the latest stable kernel (4.1.5)
and that reverting the commit from comment 6 seems to fix it.


[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-08-14 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

--- Comment #8 from Theodore Tso  ---
On Fri, Aug 14, 2015 at 11:02:14AM +, bugzilla-dae...@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=101011
> 
> I can also confirm that this bug is present in latest stable kernel (4.1.5) 
> and
> reverting commit from comment 6 seems to fix it.

Christoph,

I've since gotten two reports from users that reverting your commit:
"08439fec266c3: ext4: remove block_device_ejected" fixes a crash when
a USB stick is yanked from their system.  Looking at the reported
stack dump, it looks like the crash is happening in
account_page_dirtied() when it updates some bdi-specific statistics.

I haven't been paying attention to the recent changes in how bdi gets
torn down after the device gets removed, and in fact finding the
recent changes wasn't obvious enough after doing a brief search, but
it seems to me that if reverting this patch is making any kind of
difference, then the assertion in the commit description:

    bdi->dev now never goes away, so this function became useless.

does not hold: it implies that bdi->dev *does* become NULL, and checking
for this is useful.  In any case, I don't see any harm in reverting this
commit; what do you think?

Thanks,

- Ted
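
The check under discussion can be sketched as follows. This is a self-contained illustration with made-up stand-in types (mock_bdi, mock_sb) and function names, not the ext4 source; it only shows the idea of refusing to dirty the superblock once the backing device's dev pointer has gone NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration only; the real kernel structures differ. */
struct mock_bdi {
    void *dev;              /* NULL once the device has been torn down */
};

struct mock_sb {
    struct mock_bdi *bdi;
    int dirty_count;        /* counts how often the superblock was dirtied */
};

/* Mirrors the idea of the removed check: is the backing device gone? */
static bool mock_device_ejected(const struct mock_sb *sb)
{
    return sb->bdi == NULL || sb->bdi->dev == NULL;
}

/* Returns 0 if the superblock was marked dirty, -1 if the write was skipped. */
static int mock_commit_super(struct mock_sb *sb)
{
    if (mock_device_ejected(sb))
        return -1;          /* skip: dirtying would touch per-device state */
    sb->dirty_count++;
    return 0;
}
```

With such a guard, a commit attempted after the device has disappeared returns early instead of reaching the accounting path where the reported stack dump points (account_page_dirtied).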


[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-08-15 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

--- Comment #9 from Christoph Hellwig  ---
Hi Ted,


sorry for the delay - I saw the mail Jan Cc'ed me on yesterday.  After
my changes it should not go away and I had tested the original eject
test that it indeed didn't.  Either I forgot a case, or the major
writeback work Tejun did a little later regressed it.

As I won't have time to look into it ASAP I'd suggest reverting my
patch for now.  In the long run I really don't want to have these
checks spread over file systems, so I plan to look into it once I
get a few spare hours.


[Bug 101011] Kernel Oops when disconnecting a mounted ext4 usb stick

2015-08-16 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101011

--- Comment #10 from Theodore Tso  ---
On Sat, Aug 15, 2015 at 10:19:02AM +0200, Christoph Hellwig wrote:
> 
> sorry for the delay - I saw the mail Jan Cc'ed me on yesterday.  After
> my changes it should not go away and I had tested the original eject
> test that it indeed didn't.  Either I forgot a case, or the major
> writeback Tejun did a little later regressed it.
> 
> As I won't have time to look into it ASAP I'd suggest to revert my
> patch for now.  In the long run I really don't want to have these
> checks spread over file system so I plan to look into it once I
> get a few spare hours.

Thanks, I'll revert the patch.

I suspect we should add an ioctl to simulate a USB device unplug using
the loopback block device, so we can add a test to xfstests.

- Ted


[Bug 101891] mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-08-16 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

--- Comment #3 from Dāvis  ---
I narrowed it down to this section of the mvs_abort_task function
(drivers/scsi/mvsas/mv_sas.c):

    } else if (task->task_proto & SAS_PROTOCOL_SATA ||
        task->task_proto & SAS_PROTOCOL_STP) {
        if (SAS_SATA_DEV == dev->dev_type) {
            struct mvs_slot_info *slot = task->lldd_task;
            u32 slot_idx = (u32)(slot - mvi->slot_info);
            mv_dprintk("mvs_abort_task() mvi=%p task=%p "
                       "slot=%p slot_idx=x%x\n",
                       mvi, task, slot, slot_idx);
            task->task_state_flags |= SAS_TASK_STATE_ABORTED;
            mvs_slot_task_free(mvi, task, slot, slot_idx);
            rc = TMF_RESP_FUNC_COMPLETE;
            goto out;
        }

    }


Basically this line: "u32 slot_idx = (u32)(slot - mvi->slot_info)".
I think (slot - mvi->slot_info) returns 0x10 and that's why it crashes
(there's no "mvs_abort_task()" message in the journal, so it crashes before
that).
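
One possible reading of the oops below, offered as a hedged note: the faulting bytes marked in the Code: line (48 83 7a 10 00) appear to decode to "cmpq $0x0,0x10(%rdx)", and on x86-64 %rdx carries the third argument, here slot. With CR2 showing 0010, that would be consistent with task->lldd_task being NULL when mvs_slot_task_free() is entered, rather than with the subtraction itself faulting. A minimal sketch of a guard, using made-up stand-in types and names rather than the real mvsas structures:

```c
#include <stddef.h>

/* Stand-in types for illustration only; the real mvsas structures differ. */
struct mock_slot {
    int in_use;
};

struct mock_host {
    struct mock_slot slot_info[8];
};

struct mock_task {
    struct mock_slot *lldd_task;   /* NULL when no slot was ever assigned */
};

/* Returns the slot index, or -1 if the task has no slot to free. */
static int mock_abort_slot_index(struct mock_host *mvi, struct mock_task *task)
{
    struct mock_slot *slot = task->lldd_task;

    if (slot == NULL)
        return -1;                 /* nothing to free; avoid a NULL deref */

    /* Pointer subtraction yields an element index, not a byte offset. */
    return (int)(slot - mvi->slot_info);
}
```

Whether the driver should return early, or what it should report to libsas in that case, is not something this sketch settles; it only shows where a NULL check would sit relative to the index computation.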

kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: sas: Enter sas_scsi_recover_host busy: 1 failed: 1
kernel: sas: trying to find task 0x8801fff87500
kernel: sas: sas_scsi_find_task: aborting task 0x8801fff87500
kernel: BUG: unable to handle kernel NULL pointer dereference at
0010
kernel: IP: [] mvs_slot_task_free+0x5/0x1f0 [mvsas]
kernel: PGD 0 
kernel: Oops:  [#1] PREEMPT SMP 
kernel: Modules linked in: nls_iso8859_4 nls_cp775 vfat fat fuse nvidia(PO)
xt_CHECKSUM ipt_MASQUERADE nf_nat_masq
kernel:  serio_raw pcspkr fam15h_power snd_hda_codec_realtek snd_hda_codec_hdmi
snd_hda_codec_generic snd_hda_inte
kernel: 
kernel: CPU: 3 PID: 222 Comm: scsi_eh_7 Tainted: P   O   
4.1.5-ARCH-dirty #2
kernel: Hardware name: Gigabyte Technology Co., Ltd.
GA-990FXA-UD3/GA-990FXA-UD3, BIOS FFe 11/08/2013
kernel: task: 880222718000 ti: 88007fc9c000 task.ti: 88007fc9c000
kernel: RIP: 0010:[]  []
mvs_slot_task_free+0x5/0x1f0 [mvsas]
kernel: RSP: 0018:88007fc9fd00  EFLAGS: 00010a13
kernel: RAX: 2e8ba2e8ba2e8ba3 RBX: 8801fff87500 RCX: 45d175ba2d18107b
kernel: RDX:  RSI: 8801fff87500 RDI: 88007fb8
kernel: RBP: 88007fc9fd58 R08: 000a R09: 060d
kernel: R10: 00020cd8 R11: 060d R12: 88007fb836a0
kernel: R13: 8800ce394e00 R14: 88007fb8 R15: 8801fff87508
kernel: FS:  7f0720ffe700() GS:88022ecc()
knlGS:
kernel: CS:  0010 DS:  ES:  CR0: 8005003b
kernel: CR2: 0010 CR3: 000224182000 CR4: 000406e0
kernel: Stack:
kernel:  a017dce2 8818 88007fc9fd68 88007fc9fd28
kernel:  20e55177 88022536f208 0005 88007fc9fdb0
kernel:  8801fff87508 8800ce321000 8801fff87500 88007fc9fe28
kernel: Call Trace:
kernel:  [] ? mvs_abort_task+0x272/0x2b0 [mvsas]
kernel:  [] sas_scsi_recover_host+0x47b/0xc20 [libsas]
kernel:  [] scsi_error_handler+0xfc/0x580 [scsi_mod]
kernel:  [] ? __schedule+0x372/0xa30
kernel:  [] ? scsi_eh_get_sense+0x190/0x190 [scsi_mod]
kernel:  [] kthread+0xd8/0xf0
kernel:  [] ? kthread_worker_fn+0x170/0x170
kernel:  [] ret_from_fork+0x42/0x70
kernel:  [] ? kthread_worker_fn+0x170/0x170
Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 8b 87 b0 00 00 00 89 f6 48 89 e5
f0 48 0f b3 30 5d c3 0f 1f
80 00 00 00 00 66 66 66 66 90 <48> 83 7a 10 00 0f 84 60 01 00 00 55 48
kernel: Code: 84 00 00 00 00 00 66 66 66 66 90 55 48 8b 87 b0 00 00 00 89 f6 48
89 e5 f0 48 0f b3 30 5d c3 0f 1f 8
kernel: RIP  [] mvs_slot_task_free+0x5/0x1f0 [mvsas]
kernel:  RSP 
kernel: CR2: 0010
kernel: ---[ end trace 93debf717bb54039 ]---


[Bug 103061] New: kernel BUG at drivers/scsi/qla2xxx/qla_isr.c:2242

2015-08-18 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=103061

        Bug ID: 103061
       Summary: kernel BUG at drivers/scsi/qla2xxx/qla_isr.c:2242
       Product: SCSI Drivers
       Version: 2.5
Kernel Version: 3.10.0
      Hardware: All
            OS: Linux
          Tree: Mainline
        Status: NEW
      Severity: high
      Priority: P1
     Component: QLOGIC QLA2XXX
      Assignee: scsi_drivers-qla2...@kernel-bugs.osdl.org
      Reporter: gaofeng@memblaze.com
    Regression: No

Setup: one initiator, one target.
/sys/module/qla2xxx/parameters/ql2xmaxqdepth is set to 512.
While running an I/O test, we hit this issue.

syslog:
[10343.908025] qla2xxx [:01:00.0]-801c:12: Abort command issued
nexus=12:0:0 --  1 2002.
[10344.907437] qla2xxx [:01:00.0]-801c:12: Abort command issued
nexus=12:0:0 --  1 2002.
[10345.906840] qla2xxx [:01:00.0]-801c:12: Abort command issued
nexus=12:0:0 --  1 2002.
[10346.906245] qla2xxx [:01:00.0]-801c:12: Abort command issued
nexus=12:0:0 --  1 2002.
[10399.853793] qla2xxx [:01:00.0]-801c:12: Abort command issued
nexus=12:0:0 --  1 2002.
[10406.796828] BUG: unable to handle kernel NULL pointer dereference at
0084
[10406.796953] IP: [] qla2x00_status_entry+0x3d1/0x1150
[qla2xxx]
[10406.797071] PGD 0 
[10406.797184] Oops:  [#1] SMP 
[10406.797293] Modules linked in: qla2xxx scsi_transport_fc scsi_tgt fuse btrfs
zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc ipt_MASQUERADE
xt_CHECKSUM ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat
ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security
ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security
iptable_raw iptable_filter ip_tables sg vfat fat snd_hda_codec_realtek
snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep
snd_seq snd_seq_device mxm_wmi coretemp mei_me mei snd_pcm kvm_intel kvm
crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
[10406.799038]  aesni_intel lrw gf128mul glue_helper ablk_helper cryptd
snd_page_alloc snd_timer snd soundcore wmi shpchp pcspkr serio_raw mperf
acpi_pad nfsd auth_rpcgss nfs_acl lockd uinput sunrpc xfs libcrc32c sd_mod
crc_t10dif crct10dif_common i915 ahci i2c_algo_bit libahci e1000e
drm_kms_helper libata drm ptp pps_core i2c_core video dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: scsi_tgt]
[10406.800987] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 3.10.0-123.el7.x86_64
#1
[10406.801523] Hardware name: MSI MS-7915/Z97 MPOWER (MS-7915), BIOS V1.6
09/19/2014
[10406.802080] task: 880418c25b00 ti: 880418c3a000 task.ti:
880418c3a000
[10406.802657] RIP: 0010:[]  []
qla2x00_status_entry+0x3d1/0x1150 [qla2xxx]
[10406.803278] RSP: 0018:88042fb43ce0  EFLAGS: 00010046
[10406.803903] RAX:  RBX:  RCX:
a09d0780
[10406.804553] RDX:  RSI: 880415176740 RDI:
0800
[10406.805211] RBP: 88042fb43de0 R08: 0029 R09:

[10406.805882] R10: 8803bcd7a140 R11: 8803bd103100 R12:
880412c08000
[10406.806567] R13: 880415176740 R14: 0029 R15:
000e
[10406.807265] FS:  () GS:88042fb4()
knlGS:
[10406.807987] CS:  0010 DS:  ES:  CR0: 80050033
[10406.808719] CR2: 0084 CR3: 018ce000 CR4:
001407e0
[10406.809474] DR0:  DR1:  DR2:

[10406.810235] DR3:  DR6: 0ff0 DR7:
0400
[10406.811001] Stack:
[10406.811768]  00ef a09db6cf 8804

[10406.812579]   8804 00ef

[10406.813399]    
8804
[10406.814223] Call Trace:
[10406.815047]   
[10406.815056] 
[10406.815887]  [] ? qlt_async_event+0x72/0x310 [qla2xxx]
[10406.816746]  [] qla24xx_process_response_queue+0x32e/0x5c0
[qla2xxx]
[10406.817629]  [] ? enqueue_hrtimer+0x25/0x80
[10406.818522]  [] ? __hrtimer_start_range_ns+0x1ca/0x410
[10406.819433]  [] qla24xx_msix_rsp_q+0x4b/0xc0 [qla2xxx]
[10406.820356]  [] handle_irq_event_percpu+0x3e/0x1e0
[10406.821290]  [] handle_irq_event+0x3d/0x60
[10406.89]  [] handle_edge_irq+0x77/0x130
[10406.823177]  [] handle_irq+0xbf/0x150
[10406.824130]  [] ? atomic_notifier_call_chain+0x1a/0x20
[10406.825101]  [] do_IRQ+0x4f/0xf0
[10406.826078]  [] common_interrupt+0x6d/0x6d
[10406.827068]   
[10406.827076] 
[10406.828044]  [] ? cpuidle_enter_state+0x52/0xc0
[10406.829015]  [] cpuidle_idle_call+0xc5/0x200
[10406.829980]  [] arch_cpu_idle+0xe/0x30
[10406.830949]  [] cpu_startup_entry+0xf5/0x290
[10406.831901]  [] start_secondary+0x265/0x27b
[10406.83279

[Bug 101891] mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-08-18 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

--- Comment #4 from Dāvis  ---
(In reply to Dāvis from comment #3)
> I narrowed it down to this section of mvs_abort_task function
> (drivers/scsi/mvsas/mv_sas.c)
> 
>   } else if (task->task_proto & SAS_PROTOCOL_SATA ||
>   task->task_proto & SAS_PROTOCOL_STP) {
>   if (SAS_SATA_DEV == dev->dev_type) {
>   struct mvs_slot_info *slot = task->lldd_task;
>   u32 slot_idx = (u32)(slot - mvi->slot_info);
>   mv_dprintk("mvs_abort_task() mvi=%p task=%p "
>  "slot=%p slot_idx=x%x\n",
>  mvi, task, slot, slot_idx);
>   task->task_state_flags |= SAS_TASK_STATE_ABORTED;
>   mvs_slot_task_free(mvi, task, slot, slot_idx);
>   rc = TMF_RESP_FUNC_COMPLETE;
>   goto out;
>   }
> 
>   }
> 
> 
> Basically this line "u32 slot_idx = (u32)(slot - mvi->slot_info)".
> I think (slot - mvi->slot_info) returns 0x10 and that's why
> (there's no "mvs_abort_task()" in journal so it crashes before that.
> 

Sorry for being an idiot: that line doesn't cause any pointer
dereference, and neither does the previous line. It's obvious in hindsight:
the compiler reordered the instructions so that mvs_slot_task_free is executed
before mv_dprintk is called, and that's why the message is not in the journal.
I even wrote "NULL pointer dereference in mvs_slot_task_free" in the title,
and that's exactly where I should have looked.

So anyway, in mvs_task_prep, if pci_pool_alloc fails then task->lldd_task
is left NULL, as can be seen here:

	task->lldd_task = NULL;
	slot->n_elem = n_elem;
	slot->slot_tag = tag;

	slot->buf = pci_pool_alloc(mvi->dma_pool, GFP_ATOMIC, &slot->buf_dma);
	if (!slot->buf)
		goto err_out_tag;

Later the task is aborted with mvs_abort_task, which calls
mvs_slot_task_free with slot = task->lldd_task, which is NULL. Then in
mvs_slot_task_free

{
	if (!slot->task)
		return;

this NULL pointer dereference happens, because slot is NULL.
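
The fault address in the oops fits this reading. The faulting bytes
"48 83 7a 10 00" decode to cmpq $0x0,0x10(%rdx), and under the x86-64
System V calling convention %rdx carries the third argument, which here is
slot. The sketch below is a user-space model only: struct demo_slot_layout is
a hypothetical stand-in, and it assumes (this is not taken from the report)
that the slot structure starts with a two-pointer list head followed by the
task pointer. Under that assumption, reading slot->task through a NULL slot
touches address 0x10 on a 64-bit machine, matching the reported CR2.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a two-pointer list head. */
struct demo_list_head {
    struct demo_list_head *next;
    struct demo_list_head *prev;
};

/* Hypothetical model of the start of the slot structure; layout is assumed. */
struct demo_slot_layout {
    struct demo_list_head entry;
    void *task;
    uint32_t n_elem;
};

/*
 * Address that a load of slot->task would touch for a slot located at
 * slot_address. Pure arithmetic: nothing is dereferenced here.
 */
uintptr_t demo_task_field_address(uintptr_t slot_address)
{
    return slot_address + offsetof(struct demo_slot_layout, task);
}
```

Calling demo_task_field_address(0) gives two pointer sizes, i.e. 0x10 with
8-byte pointers.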

There are two ways to fix this: either check whether slot is NULL before
calling mvs_slot_task_free, or check it inside the function.

I went for the second option, as it seems easier and callers won't have to
check every time.

Here's a patch. I haven't tested it yet, but I think it will fix this; it's
compiling right now, so I'll let you know once I have tested it.

diff --git a/drivers/scsi/mvsas/mv_sas.c b/drivers/scsi/mvsas/mv_sas.c
index 454536c..9c78074 100644
--- a/drivers/scsi/mvsas/mv_sas.c
+++ b/drivers/scsi/mvsas/mv_sas.c
@@ -887,6 +887,8 @@ static void mvs_slot_free(struct mvs_info *mvi, u32 rx_desc)
 static void mvs_slot_task_free(struct mvs_info *mvi, struct sas_task *task,
 			  struct mvs_slot_info *slot, u32 slot_idx)
 {
+	if (!slot)
+		return;
 	if (!slot->task)
 		return;
 	if (!sas_protocol_ata(task->task_proto))
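
For readers without the driver source at hand, the effect of the guard can be
modelled in user space. This is only a sketch under assumptions: demo_slot,
demo_task, demo_prep and demo_slot_task_free are hypothetical stand-ins, not
the mvsas structures or functions, and the allocation failure is simulated
with a flag.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_task;

struct demo_slot {
    struct demo_task *task;       /* owner of the slot, NULL when free */
};

struct demo_task {
    struct demo_slot *lldd_task;  /* stays NULL if prep fails early */
};

/* Models the prep path: lldd_task is cleared first and only set on success. */
int demo_prep(struct demo_task *task, struct demo_slot *slot, bool alloc_fails)
{
    task->lldd_task = NULL;
    if (alloc_fails)
        return -1;                /* simulated allocation failure */
    slot->task = task;
    task->lldd_task = slot;
    return 0;
}

/* Models the patched free path; returns true if a slot was released. */
bool demo_slot_task_free(struct demo_slot *slot)
{
    if (!slot)                    /* the added guard: nothing to free */
        return false;
    if (!slot->task)
        return false;
    slot->task->lldd_task = NULL;
    slot->task = NULL;
    return true;
}
```

With the guard in place, aborting a task whose prep failed becomes a no-op
instead of a dereference of NULL.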

[Bug 101891] mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-08-19 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

--- Comment #5 from Dāvis  ---
Success, the patch indeed fixed it :)

Now, instead of a crash, I get this ↓, but everything seems to be working and
there's no need to reboot.

kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: mvsas :07:00.0: mvsas prep failed[0]!
kernel: sas: Enter sas_scsi_recover_host busy: 19 failed: 19
kernel: sas: trying to find task 0x8801c9599100
kernel: sas: sas_scsi_find_task: aborting task 0x8801c9599100
kernel: sas: sas_scsi_find_task: task 0x8801c9599100 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x8801c9599100 is aborted
kernel: sas: trying to find task 0x8801c9599500
kernel: sas: sas_scsi_find_task: aborting task 0x8801c9599500
kernel: sas: sas_scsi_find_task: task 0x8801c9599500 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x8801c9599500 is aborted
kernel: sas: trying to find task 0x8801c9599900
kernel: sas: sas_scsi_find_task: aborting task 0x8801c9599900
kernel: sas: sas_scsi_find_task: task 0x8801c9599900 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x8801c9599900 is aborted
kernel: sas: trying to find task 0x8801ba22a500
kernel: sas: sas_scsi_find_task: aborting task 0x8801ba22a500
kernel: sas: sas_scsi_find_task: task 0x8801ba22a500 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x8801ba22a500 is aborted
kernel: sas: trying to find task 0x88000f686300
kernel: sas: sas_scsi_find_task: aborting task 0x88000f686300
kernel: sas: sas_scsi_find_task: task 0x88000f686300 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f686300 is aborted
kernel: sas: trying to find task 0x88000f687f00
kernel: sas: sas_scsi_find_task: aborting task 0x88000f687f00
kernel: sas: sas_scsi_find_task: task 0x88000f687f00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f687f00 is aborted
kernel: sas: trying to find task 0x88000f687c00
kernel: sas: sas_scsi_find_task: aborting task 0x88000f687c00
kernel: sas: sas_scsi_find_task: task 0x88000f687c00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f687c00 is aborted
kernel: sas: trying to find task 0x88000f686e00
kernel: sas: sas_scsi_find_task: aborting task 0x88000f686e00
kernel: sas: sas_scsi_find_task: task 0x88000f686e00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f686e00 is aborted
kernel: sas: trying to find task 0x88000f686a00
kernel: sas: sas_scsi_find_task: aborting task 0x88000f686a00
kernel: sas: sas_scsi_find_task: task 0x88000f686a00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f686a00 is aborted
kernel: sas: trying to find task 0x88000f687d00
kernel: sas: sas_scsi_find_task: aborting task 0x88000f687d00
kernel: sas: sas_scsi_find_task: task 0x88000f687d00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f687d00 is aborted
kernel: sas: trying to find task 0x88000f686f00
kernel: sas: sas_scsi_find_task: aborting task 0x88000f686f00
kernel: sas: sas_scsi_find_task: task 0x88000f686f00 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f686f00 is aborted
kernel: sas: trying to find task 0x88000f687500
kernel: sas: sas_scsi_find_task: aborting task 0x88000f687500
kernel: sas: sas_scsi_find_task: task 0x88000f687500 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f687500 is aborted
kernel: sas: trying to find task 0x88000f687000
kernel: sas: sas_scsi_find_task: aborting task 0x88000f687000
kernel: sas: sas_scsi_find_task: task 0x88000f687000 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x88000f687000 is aborted
kernel: sas: trying to find task 0x8800024f1900
kernel: sas: sas_scsi_find_task: aborting task 0x8800024f1900
kernel: sas: sas_scsi_find_task: task 0x8800024f1900 is aborted
kernel: sas: sas_eh_handle_sas_errors: task 0x8800024f1900 is aborted
kernel: sas: trying to find task 0x8800024f0d00
kernel: sas: sas_scsi_find_task: aborting task 0x8800024f0d00
kernel: sas: sas_

[Bug 101891] mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-08-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

Turbo Fredriksson  changed:

   What|Removed |Added

 CC||tu...@bayour.com

--- Comment #6 from Turbo Fredriksson  ---
Forgive an ignoramus, but those last lines don't look too good

kernel: ata11.00: device reported invalid CHS sector 0

I have a problem that is very much like yours, but my stack traces are
different so I'm unsure if we have the same problem. I'm going to rebuild my
kernel as well with your fix and see if it helps me as well.

[Bug 101891] mvsas prep failed, NULL pointer dereference in mvs_slot_task_free+0x5/0x1f0 [mvsas]

2015-08-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=101891

--- Comment #7 from Dāvis  ---
(In reply to Turbo Fredriksson from comment #6)
> Forgive an ignoramus, but those last lines don't look too good
> 
> kernel: ata11.00: device reported invalid CHS sector 0
> 
> I have a problem that is very much like yours, but my stack traces are
> different so I'm unsure if we have the same problem. I'm going to rebuild my
> kernel as well with your fix and see if it helps me as well.

My fix is only for "NULL pointer dereference in mvs_slot_task_free" in the
mvsas driver. If you use hardware with a different driver, then this fix won't
change anything for you. And even if you have such hardware and use this
driver, you might have hit a different bug. You really should have posted a
stack trace, logs, etc.

As for those other messages, I have no clue what they actually mean. But as I
understand it, pci_pool_alloc fails under heavy I/O load, so those tasks are
aborted; that probably prevents the kernel from accessing the disks, and a
disk reset is issued. Then it can access all disks again and everything keeps
working.

[Bug 153171] scsi host6: runtime PM trying to activate child device host6 but parent (2-2:1.0) is not active

2016-10-18 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=153171

Scott Mowerson  changed:

   What|Removed |Added

 CC||smower...@gmail.com

--- Comment #10 from Scott Mowerson  ---
This is preventing me from booting Ubuntu 16.10:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1631058?comments=all 

But I am not using a low-latency kernel.

-- 
You are receiving this mail because:
You are the assignee for the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Bug 178381] New: Suspend to RAM test failed while CONFIG_SCSI_MQ_DEFAULT is set

2016-10-19 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=178381

Bug ID: 178381
   Summary: Suspend to RAM test failed while
CONFIG_SCSI_MQ_DEFAULT is set
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: 4.8.2
  Hardware: x86-64
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: chintz...@gmail.com
Regression: No

Hi, 

The OS is Fedora 24 with Linux kernel 4.8.2 and SCSI_MQ enabled.

I ran the commands below to do a suspend-to-RAM test (S3 mode).

$ echo platform > /sys/power/pm_test;echo mem > /sys/power/state

The system resumes correctly when only one disk is installed.

However, it does not resume correctly when there are multiple disks:
the system hangs.


I set hung_task_timeout_secs=120 to get the messages below.


[ 1280.985118] call :00:01.1+ returned 0 after 5 usecs
[ 1479.326798] INFO: task kworker/u8:2:1772 blocked for more than 120 seconds.
[ 1479.333754] Tainted: G U 4.8.2 #1
[ 1479.338718] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 1479.346533] kworker/u8:2 D 8802596f3bf8 0 1772 2 0x
[ 1479.353622] Workqueue: events_unbound async_run_entry_fn
[ 1479.358951] 8802596f3bf8 00ff8802596f3bd8 880250e71e00
880263dd5a00
[ 1479.366428] 88026dd19500 8802596f4000 7fff
88025f5e2260
[ 1479.373908] 880250e71e00 880260af0740 8802596f3c10
817f5cc5
[ 1479.381385] Call Trace:
[ 1479.383838] [] schedule+0x35/0x80
[ 1479.388800] [] schedule_timeout+0x2c4/0x440
[ 1479.394627] [] ? __enqueue_entity+0x6c/0x70
[ 1479.400454] [] ? enqueue_entity+0x2e8/0x8e0
[ 1479.406279] [] wait_for_completion+0xe1/0x120
[ 1479.412276] [] ? wake_up_q+0x80/0x80
[ 1479.417497] [] ? dpm_wait+0x40/0x40
[ 1479.422629] [] dpm_wait+0x32/0x40
[ 1479.427587] [] dpm_wait_fn+0x11/0x20
[ 1479.432805] [] device_for_each_child+0x50/0x90
[ 1479.438890] [] __device_suspend+0x51/0x380
[ 1479.444642] [] async_suspend+0x1f/0xa0
[ 1479.450035] [] async_run_entry_fn+0x39/0x140
[ 1479.455950] [] process_one_work+0x184/0x430
[ 1479.461776] [] worker_thread+0x4e/0x480
[ 1479.467255] [] ? process_one_work+0x430/0x430
[ 1479.473269] [] kthread+0xd8/0xf0
[ 1479.478144] [] ret_from_fork+0x1f/0x40
[ 1479.483535] [] ? kthread_worker_fn+0x180/0x180
[ 1479.489621] INFO: task md0_resync:2133 blocked for more than 120 seconds.
[ 1479.496400] Tainted: G U 4.8.2 #1
[ 1479.501358] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 1479.509173] md0_resync D 88025717bbc8 0 2133 2 0x
[ 1479.516274] 88025717bbc8 00ff88025717bb78 880263f69e00
880263dd5a00
[ 1479.523766] 58ec83c0 88025717c000 88025717bc48
880258662088
[ 1479.531264] 880258662070 880250a6f300 88025717bbe0
817f5cc5
[ 1479.538753] Call Trace:
[ 1479.541203] [] schedule+0x35/0x80
[ 1479.546163] [] raid1_sync_request+0x2da/0xba0 [raid1]
[ 1479.552865] [] ? prepare_to_wait_event+0xf0/0xf0
[ 1479.559122] [] md_do_sync+0x8bb/0xec0
[ 1479.564428] [] ? prepare_to_wait_event+0xf0/0xf0
[ 1479.570688] [] ? check_preempt_curr+0x7e/0x90
[ 1479.576686] [] ? kernel_sigaction+0x43/0xe0
[ 1479.582538] [] md_thread+0x139/0x150
[ 1479.587757] [] ? find_pers+0x70/0x70
[ 1479.592978] [] kthread+0xd8/0xf0
[ 1479.597863] [] ret_from_fork+0x1f/0x40
[ 1479.603254] [] ? kthread_worker_fn+0x180/0x180
[ 1479.609350] INFO: task ext4lazyinit:2167 blocked for more than 120 seconds.
[ 1479.616297] Tainted: G U 4.8.2 #1
[ 1479.621256] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
[ 1479.629071] ext4lazyinit D 8802530e7be8 0 2167 2 0x
[ 1479.636161] 8802530e7be8 041a2d80 880253cf
880263dd5a00
[ 1479.643637] 00042008 8802530e8000 88026dd99500
7fff
[ 1479.651114] 880253cf 1000 8802530e7c00
817f5cc5
[ 1479.658587] Call Trace:
[ 1479.661034] [] schedule+0x35/0x80
[ 1479.665995] [] schedule_timeout+0x2c4/0x440
[ 1479.671818] [] ? md_make_request+0xf6/0x230
[ 1479.677645] [] ? ktime_get+0x41/0xb0
[ 1479.682864] [] io_schedule_timeout+0xa4/0x110
[ 1479.688863] [] wait_for_completion_io+0xe1/0x120
[ 1479.695122] [] ? wake_up_q+0x80/0x80
[ 1479.700340] [] submit_bio_wait+0x65/0x90
[ 1479.705905] [] blkdev_issue_zeroout+0x172/0x1e0
[ 1479.712073] [] ext4_init_inode_table+0x18d/0x390
[ 1479.718343] [] ext4_lazyinit_thread+0x136/0x330
[ 1479.724514] [] ? init_once+0x80/0x80
[ 1479.729732] [] kthread+0xd8/0xf0
[ 1479.734607] [] ret_from_fork+0x1f/0x40
[ 1479.73] [] ? kthread_worker_fn+0x180/0x180
[ 1479.746082] INFO: task test.sh:2189 blocked for more than 120 seconds.
[ 1479.752599] Tainted: G U 4.8.2 #1
[ 1479.757557] "echo 0 >

[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-10-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #1 from Badalyan Vyacheslav  ---
I changed the product and component.

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-10-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #2 from Badalyan Vyacheslav  ---
Adapter Selected is a Avago SAS: SAS3008(C0)

Num   Ctlr          FW Ver        NVDATA        x86-BIOS      PCI Addr
0     SAS3008(C0)   13.00.00.00   0b.02.00.03   08.31.00.00   00:0a:00:00
1     SAS3008(C0)   13.00.00.00   0b.02.00.03   08.31.00.00   00:08:00:00

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-10-24 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

Len Brown  changed:

   What|Removed |Added

  Component|Config-Tables   |Other
   Assignee|acpi_config-tables@kernel-b |scsi_drivers-other@kernel-b
   |ugs.osdl.org|ugs.osdl.org
Product|ACPI|SCSI Drivers

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-10-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

--- Comment #17 from Chloé Desoutter  ---
Created attachment 242701
  --> https://bugzilla.kernel.org/attachment.cgi?id=242701&action=edit
pm8001_defs.h patch

PM8001_MPI_QUEUE: 1024 → 512 (more stable)

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-10-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

Chloé Desoutter  changed:

   What|Removed |Added

 CC||ch...@tigres-rouges.net

--- Comment #16 from Chloé Desoutter  ---
Hello,

I'm a victim of this bug. It heavily impaired the use of this controller on my
filer whenever I wrote data to it constantly: after a few hours it would freeze
with no possible recovery.
I've been researching the cause of this bug by comparing the Microsemi source
tree with what's in the kernel.

The SCSI queue depth is not the culprit, as changing it did not fix the issue.
In the Microsemi driver it is 128 and in the kernel tree it is 508, but this
changes nothing in the end (except that I noticed a slight performance loss
when it is set to 128).

However, the MPI queue parameter is set much higher in the kernel tree than in
the Microsemi tree.

In the Microsemi tree this is managed by the MAX_IB_QUEUE_ELEMENTS and
MAX_OB_QUEUE_ELEMENTS defines. The event queue seems to be split evenly
between reads and writes. The total queue length is 512. There is an equal
number of inbound and outbound queues there.

In the kernel tree, this is handled by the PM8001_MPI_QUEUE define (value:
1024). There is 1 inbound queue and 4 outbound queues.

I noticed that the value PM8001_MPI_QUEUE = 1024 causes crashes of the driver
on a "PMC-Sierra PM8001 SAS HBA", as reported earlier. Changing this value to
512 results in a much more stable driver. My guess is that setting the MPI
queue too large results in instructions being lost when too much data gets
queued and the controller cannot keep up with the writes.

I will attach the following patch.

--- linux/drivers/scsi/pm8001/pm8001_defs.h.orig 2016-10-25 15:15:40.470112331 +
+++ linux/drivers/scsi/pm8001/pm8001_defs.h 2016-10-24 19:13:46.533108727 +
@@ -76,7 +76,7 @@ enum port_type {

 /* driver compile-time configuration */
 #define PM8001_MAX_CCB 512 /* max ccbs supported */
-#define PM8001_MPI_QUEUE 1024   /* maximum mpi queue entries */
+#define PM8001_MPI_QUEUE 512   /* maximum mpi queue entries */
 #define PM8001_MAX_INB_NUM 1
 #define PM8001_MAX_OUTB_NUM 1
 #define PM8001_MAX_SPCV_INB_NUM 1
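
To make the role of a "maximum mpi queue entries" constant more concrete, here
is a minimal user-space sketch of a fixed-size ring with producer and consumer
counters. It is not the pm8001 implementation: DEMO_QUEUE_ENTRIES, demo_ring,
demo_post and demo_consume are hypothetical names, and the real queues are
shared with the controller firmware. The only point is that the constant
bounds how many requests can be outstanding before the producer must back off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size; a power of two so the counters may wrap safely. */
#define DEMO_QUEUE_ENTRIES 4u

struct demo_ring {
    uint32_t slots[DEMO_QUEUE_ENTRIES];
    uint32_t producer;   /* total entries posted so far */
    uint32_t consumer;   /* total entries consumed so far */
};

/* Number of entries posted but not yet consumed. */
uint32_t demo_pending(const struct demo_ring *ring)
{
    return ring->producer - ring->consumer;
}

/* Post one entry; refuse instead of overwriting when the ring is full. */
bool demo_post(struct demo_ring *ring, uint32_t value)
{
    if (demo_pending(ring) == DEMO_QUEUE_ENTRIES)
        return false;
    ring->slots[ring->producer % DEMO_QUEUE_ENTRIES] = value;
    ring->producer++;
    return true;
}

/* Consume the oldest pending entry, if there is one. */
bool demo_consume(struct demo_ring *ring, uint32_t *value)
{
    if (demo_pending(ring) == 0)
        return false;
    *value = ring->slots[ring->consumer % DEMO_QUEUE_ENTRIES];
    ring->consumer++;
    return true;
}
```

In this model a larger constant lets the producer run further ahead of the
consumer; whether that is what upsets the controller is only my guess.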

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-10-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

--- Comment #18 from Chloé Desoutter  ---
Actually, further analysis of the MicroSemi driver leads me to think that the
proper, recommended value is 256.

$ egrep '#define\s+MAX_[IO]B_QUEUE_ELEMENTS' *.h
pm8001_sas.h:#define MAX_IB_QUEUE_ELEMENTS   256
pm8001_sas.h:#define MAX_OB_QUEUE_ELEMENTS   256

so I shall test with this value for PM8001_MPI_QUEUE and see if I achieve real
stability with constant workloads.

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-10-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

--- Comment #19 from Chloé Desoutter  ---
Said parameter was introduced by this commit:
99c72ebceb4dda445b4b74c6f46035feec95a2b3

The rationale is OK, but the flaw is that the controller will sometimes crash
completely, so there needs to be another way.

I suggest we set PM8001_MPI_QUEUE to 256 and find another way to mitigate
these performance degradations, because we cannot afford crashed SAS
controllers.

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-10-26 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

--- Comment #20 from Chloé Desoutter  ---
Currently testing w/ PM8001_MPI_QUEUE = 256.

Prospective patch attached.

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-10-26 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

Chloé Desoutter  changed:

   What|Removed |Added

 Attachment #242701|0   |1
is obsolete||

--- Comment #21 from Chloé Desoutter  ---
Created attachment 242801
  --> https://bugzilla.kernel.org/attachment.cgi?id=242801&action=edit
pm8001_defs.h patch [2]

Set PM8001_MPI_QUEUE to 256, as in the MicroSemi driver.

[Bug 85751] iSCSI initiator lockup during logout

2016-10-27 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=85751

Jaden  changed:

   What|Removed |Added

 CC||jaden1...@gmail.com

--- Comment #4 from Jaden  ---
I also ran into a similar issue, but not at the logout stage. It can also
happen occasionally when the links go down. Here are my steps to reproduce:

1. while :; do dd if=/dev/sdc of=/dev/null bs=1K count=1 iflag=direct; done
2. kill -SIGSTOP `pidof iscsid`
3. iptables -A OUTPUT -p tcp --dport 3260 -j DROP

I think it is caused by a state conflict between waiting for the lost I/O
request and the iSCSI device removal procedure. Any new thoughts?

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-10-27 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

--- Comment #22 from Chloé Desoutter  ---
(In reply to Chloé Desoutter from comment #20)
> Currently testing w/ PM8001_MPI_QUEUE = 256.
> 
> Prospective patch attached.

I confirm that after 36 hours of intensive workload, I see no visible
performance loss on a PM8001 and that there's been no data error since then.

[Bug 172831] Adaptec ASR7805 in RAID (Expose RAW) mode recreate block devices and broke MDRAID

2016-10-28 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=172831

--- Comment #10 from Badalyan Vyacheslav  ---
It took too long. I sold the controller and got an LSI instead.

[Bug 172841] aacraid does not support TRIM in Raw (Pass Through) devices

2016-10-28 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=172841

--- Comment #2 from Badalyan Vyacheslav  ---
It took too long. I sold the controller and got an LSI instead.

[Bug 121531] Adaptec 7805H SAS HBA (pm80xx): hangs when writing >80MB at once

2016-11-01 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=121531

--- Comment #23 from Chloé Desoutter  ---
With 256 I can still trigger crashes, quite randomly, after a long while under
heavy workload.

Out of curiosity I checked the pmspcv driver from FreeBSD, and they use an even
lower value:

#define MPI_MAX_INBOUND_QUEUES  64 /**< Maximum number of inbound queues */
#define MPI_MAX_OUTBOUND_QUEUES 64 /**< Maximum number of outbound queues */

/**< Max # of memory chunks supported */
#define MPI_MAX_MEM_REGIONS (MPI_MAX_INBOUND_QUEUES + MPI_MAX_OUTBOUND_QUEUES) + 4
#define MPI_LOGSIZE 4096  /**< default size */

so I'll try with this value and report back on stability and performance.

[Bug 187221] New: HPSA resetting logical / reset logical

2016-11-07 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187221

Bug ID: 187221
   Summary: HPSA resetting logical / reset logical
   Product: IO/Storage
   Version: 2.5
Kernel Version: 4.4.x, 4.8.x
  Hardware: Intel
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: SCSI
  Assignee: linux-scsi@vger.kernel.org
  Reporter: kernel...@bof.de
Regression: No

I have about 20 HP DL 380 (some 360) servers, from Gen7 to Gen9, using the HPSA
driver with various smartarray controllers.

For a long time I've been running mainline 3.14 kernels, without any issues.
Some time ago I updated to mainline 4.4.x, up to the most recent 4.4.30.

Now I noticed, especially on one server, but in the logs on 6 of them, the
following kind of message:

2016-11-06T22:09:50.227592+01:00 HOST kernel: [68853.338610] hpsa :03:00.0:
scsi 0:1:0:0: resetting logical  Direct-Access HP   LOGICAL VOLUME  
RAID-5 SSDSmartPathCap- En- Exp=1
2016-11-06T22:10:18.713759+01:00 HOST kernel: [68881.832436] hpsa :03:00.0:
scsi 0:1:0:0: reset logical  completed successfully Direct-Access HP  
LOGICAL VOLUME   RAID-5 SSDSmartPathCap- En- Exp=1

I see such messages, _usually_ only with 1 second between resetting/reset, on
machines with the following controller+controller firmware variants:
1 P410i 5.14
1 P420i 5.42
2 P440ar 3.02
1 P440ar 3.56
1 P440ar 4.02

The one machine for which I've shown the concrete message is a P440ar with
firmware 3.02. There, unlike on the other machines, the reset sometimes takes
up to 20 seconds, and meanwhile all I/O stalls.
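
The time between a "resetting logical" line and its matching "reset logical"
line can be pulled out of the logs mechanically. A minimal sketch, assuming the
mail-wrapped log lines are first joined back into one line each; the function
name and regular expression are illustrative and not part of any hpsa tooling:

```python
import re

# Sample lines copied from the report, with the mail wrapping undone.
LOG = """\
2016-11-06T22:09:50.227592+01:00 HOST kernel: [68853.338610] hpsa :03:00.0: scsi 0:1:0:0: resetting logical  Direct-Access HP   LOGICAL VOLUME   RAID-5 SSDSmartPathCap- En- Exp=1
2016-11-06T22:10:18.713759+01:00 HOST kernel: [68881.832436] hpsa :03:00.0: scsi 0:1:0:0: reset logical  completed successfully Direct-Access HP   LOGICAL VOLUME   RAID-5 SSDSmartPathCap- En- Exp=1
"""

# Kernel timestamp, SCSI address, and which half of the pair this line is.
EVENT = re.compile(r"\[\s*(\d+\.\d+)\] hpsa \S+ scsi (\S+): (resetting|reset) logical")


def reset_durations(text):
    """Return (device, seconds) for each resetting/reset pair, in log order."""
    started = {}
    durations = []
    for line in text.splitlines():
        m = EVENT.search(line)
        if not m:
            continue
        stamp, device, phase = float(m.group(1)), m.group(2), m.group(3)
        if phase == "resetting":
            started[device] = stamp
        elif device in started:
            durations.append((device, stamp - started.pop(device)))
    return durations


for device, seconds in reset_durations(LOG):
    print(f"{device}: reset took {seconds:.1f} s")
```

Run over a full kernel log, this would give one duration per reset episode, which makes it easier to compare machines or kernel versions.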

I also tested with 4.8.x kernels and saw the same symptoms there. I'm somewhat
sure that I did not see these with 3.14 kernels. This morning I rebooted the
most problematic box into 3.14.79, and so far it has been silent. I'll report
if that changes.

Apart from these log lines, there is nothing strange to be found - no ILO or
IML notifications visible, no other kernel messages, no drive failures, SMART
alerts, or performance regressions...


[Bug 187231] New: kernel panic during hpsa MSI plus tg3 MSI

2016-11-07 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187231

Bug ID: 187231
   Summary: kernel panic during hpsa MSI plus tg3 MSI
   Product: IO/Storage
   Version: 2.5
Kernel Version: 4.8.6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: SCSI
  Assignee: linux-scsi@vger.kernel.org
  Reporter: kernel...@bof.de
Regression: No

Created attachment 243801
  --> https://bugzilla.kernel.org/attachment.cgi?id=243801&action=edit
kernel 4.8.6 .config

I'm not sure whether this is a SCSI / HPSA bug or a networking / tg3 driver
bug. Both are seen in the stack dump. As the trigger seems to be HPSA I'm
reporting as a SCSI issue here...

I've been recently attempting to run mainline 4.8.x kernels, most recently
4.8.6, on our production HP DL 380 Intel servers.

On several of them there is some related issue reported in
https://bugzilla.kernel.org/show_bug.cgi?id=187221 where the HPSA driver on
some of the hosts sometimes resets the logical device. I had seen that already
with 4.4.x kernels, and again with 4.8.6.

Now, specifically with 4.8.6, on the box which has the worst of these symptoms,
I _additionally_ experienced multiple full kernel panics. The same box (with
the same hpsa reset symptoms) had been running 4.4.x kernels before without
such kernel panics. The panics then happened multiple times with about a day in
between.

On the last round I had the ILO SSH console running under screen with logging
enabled, and was able to retrieve the following panic backtrace:

[187283.903173] hpsa :03:00.0: scsi 0:1:0:0: resetting logical 
Direct-Access HP   LOGICAL VOLUME   RAID-5 SSDSmartPathCap- En- Exp=1   
[187314.331375] sd 0:1:0:0: rejecting I/O to offline device 
[187314.413441] sd 0:1:0:0: rejecting I/O to offline device 
[187314.854183] sd 0:1:0:0: rejecting I/O to offline device 
... lots of these ...
[187328.991285] sd 0:1:0:0: rejecting I/O to offline device 
[187328.991389] sd 0:1:0:0: rejecting I/O to offline device 
[187329.190166] sd 0:1:0:0: rejecting I/O to offline device 
[187329.271304]  88bd1a7e8000 88bd1a7be500 88bd7f483eb8
8143
493f
[187329.271304] Call Trace: 
[187329.271310]
[187329.271310]  [] ? tg3_poll_msix+0xc2/0x160 [tg3]  
[187329.271311]  [] do_hpsa_intr_msi+0x8f/0x1c0   
[187329.271314]  [] __handle_irq_event_percpu+0x66/0xe0   
[187329.271315]  [] handle_irq_event_percpu+0x1e/0x50 
[187329.271316]  [] handle_irq_event+0x27/0x50
[187329.271318]  [] handle_edge_irq+0x65/0x140
[187329.271320]  [] handle_irq+0x15/0x20  
[187329.271321]  [] do_IRQ+0x46/0xd0  
[187329.271324]  [] common_interrupt+0x7c/0x7c
[187329.271325]
[187329.271338] Code: 53 48 89 fb 48 83 ec 28 4c 8b a7 5c 02 00 00 4c 8b bf 40 02 00 00 4c 8b b7 38 02 00 00 4c 8b af 4c 02 00 00 49 8b 04 24 4c 89 e7 <48> 8b 80 98 00 00 00 48 89 45 c0 49 8b 87 d0 01 00 00 48 89 45
[187329.271339] RIP  [] complete_scsi_command+0x37/0x8c0  
[187329.271339]  RSP  
[187329.271339] CR2: 0098   
[187329.271341] ---[ end trace 52898916f0da5c53 ]---
[187329.273413] Kernel panic - not syncing: Fatal exception in interrupt
[187330.308465] Shutting down cpus with NMI 
[187330.308471] Kernel Offset: disabled 
[187330.919173] Rebooting in 300 seconds..  
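
The Code: bytes and the CR2 value in the trace above can be cross-checked
against each other. A minimal sketch, assuming the mail-wrapped Code: line
joins into the byte string below and that the byte in angle brackets starts the
faulting instruction (the usual oops convention); the helper names are
hypothetical, and the decoder handles only the single instruction form seen
here:

```python
# Code: bytes from the trace, with the mail wrapping undone (an assumption).
CODE_LINE = (
    "53 48 89 fb 48 83 ec 28 4c 8b a7 5c 02 00 00 4c 8b bf 40 02 00 00 "
    "4c 8b b7 38 02 00 00 4c 8b af 4c 02 00 00 49 8b 04 24 4c 89 e7 "
    "<48> 8b 80 98 00 00 00 48 89 45 c0 49 8b 87 d0 01 00 00 48 89 45"
)

# Low eight x86-64 general-purpose registers, indexed by their 3-bit encoding.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code):
    """Return the bytes starting at the <..>-marked byte."""
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return bytes(int(t.strip("<>"), 16) for t in tokens[start:])


def decode_mov_disp32(insn):
    """Decode 'REX.W 8B /r' with a register base and a 32-bit displacement.

    Returns (reg, rm, displacement); raises ValueError for any other form.
    """
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a plain 64-bit MOV load")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only the base+disp32 form without SIB is handled")
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return reg, rm, disp


reg, rm, disp = decode_mov_disp32(faulting_bytes(CODE_LINE))
print(f"mov {disp:#x}(%{REGS[rm]}),%{REGS[reg]}")
```

If this decoding is right, the faulting load reads offset 0x98 from a register,
which would be consistent with the CR2 value ending in 98 and with a NULL
pointer being dereferenced inside complete_scsi_command. The trace alone does
not show which pointer that was.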

I'll attach my kernel .config.

As this is a production system and so far the panics only hit with our usual
(webserver and DB kvm machine) production load active, there's not much testing
or bisecting I can do, but I didn't want to drop the issue unreported, either. 

Hope this helps somebody. If there is any more info I can provide, just ask
what would be useful.

(I'm back to running 4.4.x)


[Bug 187231] kernel panic during hpsa MSI plus tg3 MSI

2016-11-07 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187231

Don  changed:

   What|Removed |Added

 CC||don.br...@microsemi.com

--- Comment #1 from Don  ---
Created attachment 243811
  --> https://bugzilla.kernel.org/attachment.cgi?id=243811&action=edit
Patch to correct resets

I will be uploading this patch to linux-scsi this week.

I am attaching the patch in case you would like to test this patch now.


[Bug 187231] kernel panic during hpsa MSI plus tg3 MSI

2016-11-07 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187231

--- Comment #2 from Patrick Schaaf  ---
Thanks Don for the reaction!

Right now, on the box that had that panic and the worst resetting/reset issues
(see the other bug I linked), I'm back to 3.14.79, and want to stay there for
another 24 to 36 hours, to see that this issue was not present with that kernel
series.

What would your patch help with? Specifically the panic potential in case a
logical device reset is ongoing? Or should it affect / remedy the mysterious
(to me) "resetting logical" events in the first place?

I'm willing to test patches on that box starting Thursday, but I'd like to
understand a bit better what we are dealing with here.


[Bug 187231] kernel panic during hpsa MSI plus tg3 MSI

2016-11-07 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187231

--- Comment #3 from Don  ---

(In reply to Patrick Schaaf from comment #2)
> Thanks Don for the reaction!
> 
> Right now, on the box that had that panic and the worst resetting/reset
> issues (see the other bug I linked), I'm back to 3.14.79, and want to stay
> there for another 24 to 36 hours, to see that this issue was not present
> with that kernel series.
> 
> What would your patch help with? Specifically the panic potential in case a
> logical device reset is ongoing? Or should it affect / remedy the mysterious
> (to me) "resetting logical" events in the first place?
> 
> I'm willing to test patches on that box starting Thursday, but I'd like to
> understand a bit better what we are dealing with here.

The specific issue that this patch addresses is that during a reset,
complete_scsi_command returns without having called scsi_done which causes the
OS to offline the disk (after two more occurrences). But this code path is not
often followed so the issue does not happen with all resets.

Some other recently applied patches should also be tested.

From git format-patch:
0457-scsi-hpsa-Check-for-null-device-pointers.patch
* This checks for a NULL device that can happen if the OS
  offlines the disk because of the aforementioned reset issue.
0460-scsi-hpsa-Check-for-null-devices-in-ioaccel-submissi.patch
0462-scsi-hpsa-correct-call-to-hpsa_do_reset.patch
* Fine tunes resets into LOGICAL/Physical resets.

A patch I still have pending on linux-scsi
0464-hpsa-add-generate-controller-NMI-on-lockup.patch
* This patch just adds more granularity on lock-up detection.

It would be nice to know why the reset is happening in the first place.


[Bug 151631] "Synchronizing SCSI cache" fails during(and delays) reboot/shutdown

2016-11-08 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=151631

Gianpaolo  changed:

   What|Removed |Added

 CC||gianpao...@gmail.com

--- Comment #5 from Gianpaolo  ---
Created attachment 243891
  --> https://bugzilla.kernel.org/attachment.cgi?id=243891&action=edit
Patch reverting commit 2c85025c75dfe7ddc2bb33363a998dad59383f94

This patch solved bug https://bugzilla.kernel.org/show_bug.cgi?id=187061, which
I suspect is a duplicate of this bug. It would help if someone affected by this
bug could test it and confirm that it works (it should work with both v4.8.6
and v4.9-rc4).


[Bug 187231] kernel panic during hpsa MSI plus tg3 MSI

2016-11-08 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187231

--- Comment #4 from Patrick Schaaf  ---
That problematic box, which showed the kernel panic with 4.8.6, and the
resetting/reset-up-to-20-seconds pauses several times a day with both 4.8 and
4.4.x, has now been running on 3.14.79 (with the same kvm load as before), for
30 hours, without any such HPSA resetting symptoms, or untoward pauses in the
VMs that I could otherwise notice in monitoring.

So somehow 3.14 does not trigger these episodes, or so it seems.


[Bug 66611] Very Slow I/O performance on SAS1064

2016-11-09 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=66611

Szőgyényi Gábor  changed:

   What|Removed |Added

 CC||szg0...@freemail.hu

--- Comment #3 from Szőgyényi Gábor  ---
Please retest this bug with the latest kernel image.


[Bug 187381] New: 4.9.-rc4 produces hundreds of unusable scsi devices.

2016-11-09 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187381

Bug ID: 187381
   Summary: 4.9.-rc4 produces hundreds of unusable scsi devices.
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: 4.9-rc4
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: samuel.silb...@hds.com
Regression: No

Created attachment 244091
  --> https://bugzilla.kernel.org/attachment.cgi?id=244091&action=edit
Debian's kernel log file.  Note that this contains functional boots with older
kernels as well earlier in the file.

With both my hand-compiled 4.9-rc4 from kernel.org sources and Ubuntu's
4.9.0-040900rc4-generic, the kernel creates hundreds of useless /dev/sd??
devices.  This system is fine with various 4.4 and 4.8 kernels.

The logs are spammed with messages like these
2016-11-09T12:17:00.021765-08:00 Node001 kernel: [75590.503155] Dev sdjm: unable to read RDB block 1
2016-11-09T12:17:00.021767-08:00 Node001 kernel: [75590.503188]  sdjm: unable to read partition table
2016-11-09T12:17:00.021769-08:00 Node001 kernel: [75590.503191] sdjm: partition table beyond EOD, enabling native capacity
2016-11-09T12:17:00.021776-08:00 Node001 kernel: [75590.504463] sd 1:3:126:0: [sdjl] Sector size 0 reported, assuming 512.
2016-11-09T12:17:00.021778-08:00 Node001 kernel: [75590.504486] sd 1:3:127:0: [sdjm] Sector size 0 reported, assuming 512.
2016-11-09T12:17:00.025711-08:00 Node001 kernel: [75590.504624] Dev sdjl: unable to read RDB block 1
2016-11-09T12:17:00.025720-08:00 Node001 kernel: [75590.504625] Dev sdjm: unable to read RDB block 1
2016-11-09T12:17:00.025722-08:00 Node001 kernel: [75590.504635]  sdjm: unable to read partition table
2016-11-09T12:17:00.025724-08:00 Node001 kernel: [75590.504687] sdjm: partition table beyond EOD, truncated
2016-11-09T12:17:00.025725-08:00 Node001 kernel: [75590.504697]  sdjl: unable to read partition table
2016-11-09T12:17:00.025727-08:00 Node001 kernel: [75590.504701] sdjl: partition table beyond EOD, truncated
2016-11-09T12:17:00.025728-08:00 Node001 kernel: [75590.505084] sd 1:3:127:0: [sdjm] Sector size 0 reported, assuming 512.
2016-11-09T12:17:00.025730-08:00 Node001 kernel: [75590.505088] sd 1:3:126:0: [sdjl] Sector size 0 reported, assuming 512.
2016-11-09T12:17:00.025732-08:00 Node001 kernel: [75590.505115] sd 1:3:126:0: [sdjl] Attached SCSI disk
2016-11-09T12:17:00.025733-08:00 Node001 kernel: [75590.505151] sd 1:3:127:0: [sdjm] Attached SCSI disk
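
The device names in these messages already show the scale of the problem. A
minimal sketch, assuming the conventional sd naming scheme in which suffixes
run a..z, then aa..zz, and so on; the function name is illustrative:

```python
def sd_ordinal(name):
    """Return the 1-based position of an sd device name (sda -> 1, sdz -> 26)."""
    if not name.startswith("sd") or not name[2:].isalpha() or not name[2:].islower():
        raise ValueError(f"not a whole-disk sd name: {name!r}")
    ordinal = 0
    for ch in name[2:]:
        # Each letter is one digit in a 26-letter system with no zero digit.
        ordinal = ordinal * 26 + (ord(ch) - ord("a") + 1)
    return ordinal


for name in ("sda", "sdz", "sdaa", "sdjl", "sdjm"):
    print(name, sd_ordinal(name))
```

Under that assumption, sdjm is the 273rd disk name handed out, on a system
whose later listing shows 13 disks.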


[Bug 187381] 4.9.-rc4 produces hundreds of unusable scsi devices.

2016-11-09 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187381

--- Comment #1 from Samuel Flory Silbory  ---
Created attachment 244101
  --> https://bugzilla.kernel.org/attachment.cgi?id=244101&action=edit
Storcli output from system.

Note that this occurs with both really old firmware and the current megaraid
firmware.  The megaraid configuration is 12 single-drive RAID 0 arrays.


[Bug 187381] 4.9.-rc4 produces hundreds of unusable scsi devices.

2016-11-09 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187381

--- Comment #2 from Samuel Flory Silbory  ---
Certainly I'll give it a try.

From: James Bottomley [james.bottom...@hansenpartnership.com]
Sent: Wednesday, November 09, 2016 12:49 PM
To: bugzilla-dae...@bugzilla.kernel.org; linux-scsi@vger.kernel.org
Cc: Samuel Silbory
Subject: Re: [Bug 187381] New: 4.9.-rc4 produces hundreds of unusable scsi
devices.

On Wed, 2016-11-09 at 20:20 +, bugzilla-dae...@bugzilla.kernel.org
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=187381
>
> Bug ID: 187381
>Summary: 4.9.-rc4 produces hundreds of unusable scsi
> devices.
>Product: SCSI Drivers
>Version: 2.5
> Kernel Version: 4.9-rc4
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Other
>   Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
>   Reporter: samuel.silb...@hds.com
> Regression: No
>
> Created attachment 244091
>   --> https://bugzilla.kernel.org/attachment.cgi?id=244091&action=edit
> Debian's kernel log file.  Note that this contains functional boots
> with older
> kernels as well earlier in the file.
>
> With both my hand compiled 4.9-rc4 from kernel.org sources and
> Ubuntu's
> 4.9.0-040900rc4-generic the kernel creates hundred of useless
> /dev/sd??
> devices.  This system is fine with various 4.4, and 4.8 kernels.
>
> The logs are spammed with messages like these
> 2016-11-09T12:17:00.021765-08:00 Node001 kernel: [75590.503155] Dev
> sdjm:
> unable to read RDB block 1
> 2016-11-09T12:17:00.021767-08:00 Node001 kernel: [75590.503188]
>  sdjm: unable
> to read partition table
> 2016-11-09T12:17:00.021769-08:00 Node001 kernel: [75590.503191] sdjm:
> partition
> table beyond EOD, enabling native capacity
> 2016-11-09T12:17:00.021776-08:00 Node001 kernel: [75590.504463] sd
> 1:3:126:0:

This was reported to the mailing list and should be fixed by this:

http://marc.info/?l=linux-scsi&m=147868920429684

We'll fast track this, but can you verify that it fixes your issue?

Thanks,

James


[Bug 187381] 4.9.-rc4 produces hundreds of unusable scsi devices.

2016-11-09 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187381

--- Comment #3 from Samuel Flory Silbory  ---
Seems sane now.
root@pcsnode:~# uname -a
Linux pcsnode 4.9.0-rc4-1-default #1 SMP Wed Nov 9 13:59:28 PST 2016 x86_64 GNU/Linux
root@pcsnode:~# ls /dev/sd*
/dev/sda  /dev/sdc  /dev/sde  /dev/sdg  /dev/sdi  /dev/sdk   /dev/sdk2  /dev/sdk4  /dev/sdk6  /dev/sdl
/dev/sdb  /dev/sdd  /dev/sdf  /dev/sdh  /dev/sdj  /dev/sdk1  /dev/sdk3  /dev/sdk5  /dev/sdk7  /dev/sdm
root@pcsnode:~# 

The only difference compared to 4.8/4.4 is that the boot drive on the SATA
controller ended up as sdk instead of sdm (the last drive).  It's not an issue
for me, as I mount things by UUID.



[Bug 151631] "Synchronizing SCSI cache" fails during(and delays) reboot/shutdown

2016-11-09 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=151631

Daniele Viganò  changed:

   What|Removed |Added

 CC||dennyvatw...@gmail.com

--- Comment #6 from Daniele Viganò  ---
I can confirm exactly the same bug on my Dell Latitude E5450 with an SSD
(Samsung 840) running Fedora 24 and kernel 4.8.6 (4.7.9 was fine).

The patch provided by Gianpaolo solved the issue on my machine.


[Bug 187381] 4.9.-rc4 produces hundreds of unusable scsi devices.

2016-11-10 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187381

Tommy Wu  changed:

   What|Removed |Added

 CC||wu.to...@gmail.com

--- Comment #4 from Tommy Wu  ---
I hit the same issue today.
After applying the patch, everything works fine now.


[Bug 187231] kernel panic during hpsa MSI plus tg3 MSI

2016-11-11 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187231

--- Comment #5 from Patrick Schaaf  ---
After almost 4 days on 3.14.79, my problematic box finally made some noise,
like this:

2016-11-11T03:31:10.608539+01:00 kvm3f kernel: [320020.727691] hpsa :03:00.0: Abort request on C0:B0:T0:L0
2016-11-11T03:31:10.608555+01:00 kvm3f kernel: [320020.728175] hpsa :03:00.0: cp 8868f2c17000 is reported invalid (probably means target device no longer present)
2016-11-11T03:31:10.608557+01:00 kvm3f kernel: [320020.728796] hpsa :03:00.0: cp 8868f2c17000 is reported invalid (probably means target device no longer present)
2016-11-11T03:31:10.608558+01:00 kvm3f kernel: [320020.729389] hpsa :03:00.0: FAILED abort on device C0:B0:T0:L0
2016-11-11T03:31:10.608560+01:00 kvm3f kernel: [320020.729708] hpsa :03:00.0: resetting device 0:0:0:0
2016-11-11T03:31:26.968534+01:00 kvm3f kernel: [320037.081397] hpsa :03:00.0: device is ready.

So, maybe there is a somewhat weirdly faulty drive in that array, which
otherwise does not show any (SMART / ILO logs) symptoms...


[Bug 187231] kernel panic during hpsa MSI plus tg3 MSI

2016-11-15 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187231

Patrick Schaaf  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |UNREPRODUCIBLE

--- Comment #6 from Patrick Schaaf  ---
After several more such Abort request / reset sequences with 3.14.79, two days
ago the box _finally_ announced that one of its 8 drives has a SMART
"predictive failure"; after swapping that drive for a spare, the symptoms are
no longer seen.

This is the third or fourth time, over the last year, that I've seen Gen9
servers with P440ar cards behave that way.

Anyway, my immediate test case is gone, so I'll close this as RESOLVED /
unreproducible...


[Bug 187221] HPSA resetting logical / reset logical

2016-11-15 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=187221

--- Comment #1 from Patrick Schaaf  ---
Some more info on my problematic machine / further diagnosing is in
https://bugzilla.kernel.org/show_bug.cgi?id=187231

Summary: at least with the P440ar controllers, such 10-30 second "logical
reset" episodes eventually reveal an underlying faulty drive, and go away when
that drive is replaced.

But there is no up-front information in the "logical reset" that would permit
pinpointing the drive on the first round.


[Bug 188061] New: On quad port QLE2564 can't add in target only 2 ports

2016-11-17 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188061

Bug ID: 188061
   Summary: On quad port QLE2564 can't add in target only 2 ports
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: 4.8.6-201.fc24.x86_64
  Hardware: x86-64
OS: Linux
  Tree: Fedora
Status: NEW
  Severity: normal
  Priority: P1
 Component: QLOGIC QLA2XXX
  Assignee: scsi_drivers-qla2...@kernel-bugs.osdl.org
  Reporter: anthony.blood...@gmail.com
Regression: No

I have a QLE2564 (quad port) with these port names:
09:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
09:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
0a:00.0 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)
0a:00.1 Fibre Channel: QLogic Corp. ISP2532-based 8Gb Fibre Channel to PCI Express HBA (rev 02)

systool -c fc_host -v | grep port_name
port_name   = "0x15b0024ffaa536f"
port_name   = "0x115b0024ffaa536f"
port_name   = "0x215b0024ffaa536f"
port_name   = "0x315b0024ffaa536f"
Same info from targetcli:
/qla2xxx> info 
Fabric module name: qla2xxx
ConfigFS path: /sys/kernel/config/target/qla2xxx
Allowed WWN types: naa
Allowed WWNs list: naa.215b0024ffaa536f, naa.15b0024ffaa536f, naa.315b0024ffaa536f, naa.115b0024ffaa536f
Fabric module features: acls
Corresponding kernel module: tcm_qla2xxx

I have successfully added 2 NAAs:
naa.115b0024ffaa536f
naa.215b0024ffaa536f

but the other two fail:
/qla2xxx> create naa.315b0024ffaa536f 
WWN not valid as: naa


[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-11-19 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #3 from Badalyan Vyacheslav  ---
Created attachment 245171
  --> https://bugzilla.kernel.org/attachment.cgi?id=245171&action=edit
Dmesg with log


[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-11-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #4 from Badalyan Vyacheslav  ---
I have attached the log in Bugzilla.


[Bug 188061] On quad port QLE2564 can't add in target only 2 ports

2016-11-20 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188061

himanshu.madh...@cavium.com  changed:

   What|Removed |Added

 CC||himanshu.madh...@qlogic.com

--- Comment #1 from himanshu.madh...@cavium.com ---
Hello, 

Which version of targetcli is used here? Also, can you provide the messages
file from when this error is seen?

Thanks,
-Himanshu


[Bug 188061] On quad port QLE2564 can't add in target only 2 ports

2016-11-21 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188061

--- Comment #2 from Anthony  ---
targetcli-2.1.fb43-2.fc25.noarch
python3-rtslib-2.1.fb60-2.fc25.noarch

I found a possible reason and an ugly workaround:
in the RTSLib package, in file utils.py, change the regexp that validates WWNs to

'naa': lambda wwn: re.match("naa\.[1235][0-9a-fA-F]{15}$", wwn),

but I can't understand why my QLogic card uses strange port names:
0x15b0024ffaa536f and 0x315b0024ffaa536f
and why, on a quad-port card, the highest digit of the port name changes from
port to port.
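
The pattern in the four port names is easier to see once they are padded to a
common width. A minimal sketch, assuming each port_name is a 64-bit value
printed as hex without leading zeros (an assumption, not something the systool
output states); the helper names are illustrative:

```python
# port_name values as reported by systool earlier in this bug.
PORT_NAMES = [
    "0x15b0024ffaa536f",
    "0x115b0024ffaa536f",
    "0x215b0024ffaa536f",
    "0x315b0024ffaa536f",
]


def to_wwn(port_name):
    """Render a hex port_name as 16 zero-padded hex digits."""
    return format(int(port_name, 16), "016x")


def top_nibble(port_name):
    """Return the most significant 4 bits of the 64-bit name."""
    return int(port_name, 16) >> 60


for name in PORT_NAMES:
    print(to_wwn(name), top_nibble(name))
```

Seen this way, the four names differ only in their leading hex digit, which
runs 0, 1, 2, 3 across the ports, and the first name is one digit shorter than
the others only because its leading digit is 0.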


[Bug 188061] On quad port QLE2564 can't add in target only 2 ports

2016-11-21 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188061

--- Comment #3 from himanshu.madh...@cavium.com ---
I don't know about that part, but from the driver's point of view any valid WWN
should be fine.


[Bug 188061] On quad port QLE2564 can't add in target only 2 ports

2016-11-21 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188061

--- Comment #4 from Anthony  ---
From rtslib (an naa WWN must start with 1, 2, or 5):

def normalize_wwn(wwn_types, wwn):
    '''
    Take a WWN as given by the user and convert it to a standard text
    representation.

    Returns (normalized_wwn, wwn_type), or exception if invalid wwn.
    '''
    wwn_test = {
    'free': lambda wwn: True,
    'iqn': lambda wwn: \
        re.match("iqn\.[0-9]{4}-[0-1][0-9]\..*\..*", wwn) \
        and not re.search(' ', wwn) \
        and not re.search('_', wwn),
    'naa': lambda wwn: re.match("naa\.[125][0-9a-fA-F]{15}$", wwn),
    'eui': lambda wwn: re.match("eui\.[0-9a-f]{16}$", wwn),
    'ib': lambda wwn: re.match("ib\.[0-9a-f]{32}$", wwn),
    'unit_serial': lambda wwn: \
        re.match("[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}$", wwn),
    }

    for wwn_type in wwn_types:
        clean_wwn = _cleanse_wwn(wwn_type, wwn)
        found_type = wwn_test[wwn_type](clean_wwn)
        if found_type:
            break
    else:
        raise RTSLibError("WWN not valid as: %s" % ", ".join(wwn_types))

    return (clean_wwn, wwn_type)
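
To see why the two port names from comment #2 fail this check, it helps to look
at the NAA field, i.e. the top four bits of the 64-bit name. The following is
only a rough illustration in C, not rtslib or qla2xxx code: the helper names
wwn_naa() and naa_accepted() are made up, and it assumes that the first port
name is a 64-bit value printed without its leading zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return the NAA field: the most significant 4 bits of a 64-bit WWN. */
unsigned int wwn_naa(uint64_t wwn)
{
	return (unsigned int)(wwn >> 60) & 0xfu;
}

/* Mirror the character class [125] from the quoted 'naa' expression. */
bool naa_accepted(uint64_t wwn)
{
	unsigned int naa = wwn_naa(wwn);

	return naa == 1 || naa == 2 || naa == 5;
}

/* Port names copied from comment #2. */
void self_test(void)
{
	assert(wwn_naa(0x315b0024ffaa536fULL) == 3);
	/* Assumption: a leading zero was dropped when this name was printed. */
	assert(wwn_naa(0x015b0024ffaa536fULL) == 0);
	assert(!naa_accepted(0x315b0024ffaa536fULL));
	assert(!naa_accepted(0x015b0024ffaa536fULL));
}
```

Under that assumption the NAA values are 3 and 0, so neither name matches
[125]. The [1235] workaround would let the first digit of the second name
through, but as far as this sketch goes it would not help with the first one.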


[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-11-21 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #5 from Badalyan Vyacheslav  ---
[   13.434720] mpt3sas_cm0: device status change: (internal device reset)
handle(0x0009), sas address(0x44332211), tag(65535)

[   13.464810] scsi 0:0:1:0: tag#0 CDB: Inquiry 12 00 00 00 24 00
[   13.464844] mpt3sas_cm0: sas_address(0x44332211), phy(0)
[   13.464846] mpt3sas_cm0:
enclosure_logical_id(0x500062b201056c80),slot(3)
[   13.464847] mpt3sas_cm0: enclosure level(0x), connector name(
^E)
[   13.464849] mpt3sas_cm0: handle(0x0009), ioc_status(scsi ioc
terminated)(0x004b), smid(1)
[   13.464850] mpt3sas_cm0: Device Status Change
[   13.464852] mpt3sas_cm0: request_len(36), underflow(0), resid(36)
[   13.464854] mpt3sas_cm0: tag(0), transfer_count(0),
sc->result(0x000b)
[   13.464855] mpt3sas_cm0: scsi_status(good)(0x00), scsi_state(state
terminated no status )(0x0c)
[   13.464861] mpt3sas_cm0: log_info(0x3000): originator(PL), code(0x11),
sub_code(0x1000)
[   13.464867] mpt3sas_cm0: device status change: (internal device reset
complete)
handle(0x0009), sas address(0x44332211), tag(65535)

-- 
You are receiving this mail because:
You are watching someone on the CC list of the bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-11-21 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

Badalyan Vyacheslav  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Kernel Version|4.8.2-1.el7.elrepo.x86_64   |4.8.9-1.el7.elrepo.x86_64


[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-11-22 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

Chaitra P B  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chaitra.basa...@broadcom.com

--- Comment #6 from Chaitra P B  ---
Badalyan,
From the attached driver logs, I can see that the firmware issues an internal
device reset many times for the drive with handle 0x0009, and that the Inquiry
command for this drive is being terminated.

The firmware can send an internal device reset for many reasons, so we need the
firmware logs to find out why it happens here. The Inquiry command is
terminated because it is issued while the device reset is in progress.

Please share the firmware logs so that we can debug and analyse this issue
further.


[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-11-22 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #7 from Badalyan Vyacheslav  ---
I need some help: how do I get the firmware log?

This is a "SAS 9300-16i Host Bus Adapter" and it does not have any HW RAID.

[root@nas sas3ircu_linux_x64_rel]# ./sas3ircu 0 LOGIR
Avago Technologies SAS3 IR Configuration Utility.
Version 14.00.00.00 (2016.07.21)
Copyright (c) 2009-2016 Avago Technologies. All rights reserved.

SAS3IRCU: The LOGIR command is not supported by the firmware currently loaded
on controller 0.

[root@nas sas3ircu_linux_x64_rel]# ./sas3ircu list
Avago Technologies SAS3 IR Configuration Utility.
Version 14.00.00.00 (2016.07.21)
Copyright (c) 2009-2016 Avago Technologies. All rights reserved.


         Adapter      Vendor  Device                       SubSys  SubSys
 Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
 -----  ------------  ------  ------  -----------------    ------  ------
   0     SAS3008       1000h   97h    00h:0ah:00h:00h      1000h   3130h

         Adapter      Vendor  Device                       SubSys  SubSys
 Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
 -----  ------------  ------  ------  -----------------    ------  ------
   1     SAS3008       1000h   97h    00h:08h:00h:00h      1000h   3130h
SAS3IRCU: Utility Completed Successfully.


[Bug 179341] mpt3sas: LSISAS3008 don't see Intel 540s SSD

2016-11-22 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=179341

--- Comment #8 from Chaitra P B  ---
(In reply to Badalyan Vyacheslav from comment #7)
> Help is needed. How to get a log FW? 
> 
> This is "SAS 9300-16i Host Bus Adapter" and not have any HW RAID. 
> 
> [root@nas sas3ircu_linux_x64_rel]# ./sas3ircu 0 LOGIR
> Avago Technologies SAS3 IR Configuration Utility.
> Version 14.00.00.00 (2016.07.21)
> Copyright (c) 2009-2016 Avago Technologies. All rights reserved.
> 
> SAS3IRCU: The LOGIR command is not supported by the firmware currently
> loaded on controller 0.
> 
> [root@nas sas3ircu_linux_x64_rel]# ./sas3ircu list
> Avago Technologies SAS3 IR Configuration Utility.
> Version 14.00.00.00 (2016.07.21)
> Copyright (c) 2009-2016 Avago Technologies. All rights reserved.
> 
> 
>          Adapter      Vendor  Device                       SubSys  SubSys
>  Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
>  -----  ------------  ------  ------  -----------------    ------  ------
>    0     SAS3008       1000h   97h    00h:0ah:00h:00h      1000h   3130h
> 
>          Adapter      Vendor  Device                       SubSys  SubSys
>  Index    Type          ID      ID    Pci Address          Ven ID  Dev ID
>  -----  ------------  ------  ------  -----------------    ------  ------
>    1     SAS3008       1000h   97h    00h:08h:00h:00h      1000h   3130h
> SAS3IRCU: Utility Completed Successfully.

Badalyan,

Connect a UART to the controller, then open a Tera Term window and use the
commands below:
"iop show diag" -> outputs the ring buffer and the trace buffer.
"pl dbg" -> outputs other debug info.


[Bug 188681] New: Function csio_hw_flash_erase_sectors() does not return correct error codes on failures

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188681

Bug ID: 188681
   Summary: Function csio_hw_flash_erase_sectors() does not return
correct error codes on failures
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

From the usages of the function csio_hw_flash_erase_sectors(), defined in
drivers/scsi/csiostor/csio_hw.c, we can infer that it should return a non-zero
error code when something goes wrong. However, it returns 0 even if a call to
csio_hw_sf1_write() or csio_hw_flash_wait_op() fails. It may be better to use
"return ret;" instead of "return 0;" at line 616. The code related to this bug
is summarised below.

csio_hw_flash_erase_sectors @@ drivers/scsi/csiostor/csio_hw.c
 589 static int
 590 csio_hw_flash_erase_sectors(struct csio_hw *hw, int32_t start, int32_t end)
 591 {
 592     int ret = 0;
 593 
 594     while (start <= end) {
 595 
 596         ret = csio_hw_sf1_write(hw, 1, 0, 1, SF_WR_ENABLE);
 597         if (ret != 0)
 598             goto out;
 599 
 600         ret = csio_hw_sf1_write(hw, 4, 0, 1,
 601                                 SF_ERASE_SECTOR | (start << 8));
 602         if (ret != 0)
 603             goto out;
 604 
 605         ret = csio_hw_flash_wait_op(hw, 14, 500);
 606         if (ret != 0)
 607             goto out;
 608 
 609         start++;
 610     }
 611 out:
 612     if (ret)
 613         csio_err(hw, "erase of flash sector %d failed, error %d\n",
 614                  start, ret);
 615     csio_wr_reg32(hw, 0, SF_OP_A);    /* unlock SF */
 616     return 0;  // return ret?
 617 }

csio_hw_fw_dload @@ drivers/scsi/csiostor/csio_hw.c
 667 static int
 668 csio_hw_fw_dload(struct csio_hw *hw, uint8_t *fw_data, uint32_t size)
 669 {
 ...
 719     ret = csio_hw_flash_erase_sectors(hw, FLASH_FW_START_SEC,
 720                                       FLASH_FW_START_SEC + i - 1);
 721     if (ret) {   // check the return value of csio_hw_flash_erase_sectors()
 722         csio_err(hw, "Flash Erase failed\n");
 723         goto out;
 724     }
 ...
 755 out:
 756     if (ret)
 757         csio_err(hw, "firmware download failed, error %d\n", ret);
 758     return ret;
 759 }
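
The difference between "return 0;" and "return ret;" can be reproduced outside
the kernel. The following is a minimal user-space sketch, not the csiostor
code: fake_flash_op() and both erase_sectors_*() functions are hypothetical
stand-ins, and the failing sector number is arbitrary.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a flash operation; fails on sector 2. */
static int fake_flash_op(int sector)
{
	return sector == 2 ? -EIO : 0;
}

/* Reported pattern: the error is recorded in ret, but 0 is returned. */
int erase_sectors_buggy(int start, int end)
{
	int ret = 0;

	while (start <= end) {
		ret = fake_flash_op(start);
		if (ret != 0)
			goto out;
		start++;
	}
out:
	return 0;
}

/* Suggested pattern: propagate whatever ret holds at the exit label. */
int erase_sectors_fixed(int start, int end)
{
	int ret = 0;

	while (start <= end) {
		ret = fake_flash_op(start);
		if (ret != 0)
			goto out;
		start++;
	}
out:
	return ret;
}

void self_test(void)
{
	assert(erase_sectors_buggy(0, 3) == 0);    /* failure is hidden */
	assert(erase_sectors_fixed(0, 3) == -EIO); /* failure is visible */
	assert(erase_sectors_fixed(0, 1) == 0);    /* success is unchanged */
}
```

With the first variant a caller that tests the return value, as
csio_hw_fw_dload() does at line 721, can never take its error branch.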

Thanks very much!


[Bug 188851] New: Function twa_probe() does not set error codes on failures

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188851

Bug ID: 188851
   Summary: Function twa_probe() does not set error codes on
failures
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

In function twa_probe(), the variable retval holds the error code. However, the
error code is not set on some failures. I am not sure whether this is the
author's intention. I list three positions that seem anomalous: (1) the error
code "-ENODEV" is not assigned to retval when the call to
twa_initialize_device_extension() (at line 2041) fails; (2) the error code
"-ENOMEM" is not assigned to retval when the call to ioremap() (at line 2062)
fails; (3) the error code "-ENODEV" is not assigned to retval when the call to
twa_reset_sequence() (at line 2072) fails. The code related to these bugs is
summarised below.

twa_probe @@ drivers/scsi/3w-9xxx.c
2003 /* This function will probe and initialize a card */
2004 static int twa_probe(struct pci_dev *pdev, const struct pci_device_id *dev_id)
2005 {
2006     struct Scsi_Host *host = NULL;
2007     TW_Device_Extension *tw_dev;
2008     unsigned long mem_addr, mem_len;
2009     int retval = -ENODEV;
2010 
2011     retval = pci_enable_device(pdev);
2012     if (retval) {
2013         TW_PRINTK(host, TW_DRIVER, 0x34, "Failed to enable pci device");
2014         goto out_disable_device;
2015     }
 ...
2041     if (twa_initialize_device_extension(tw_dev)) {
2042         TW_PRINTK(tw_dev->host, TW_DRIVER, 0x25, "Failed to initialize device extension");
             // (1) The value of retval is 0. Insert "retval = -ENODEV;" here?
2043         goto out_free_device_extension;
2044     }
2045 
2046     /* Request IO regions */
2047     retval = pci_request_regions(pdev, "3w-9xxx");
2048     if (retval) {
2049         TW_PRINTK(tw_dev->host, TW_DRIVER, 0x26, "Failed to get mem region");
2050         goto out_free_device_extension;
2051     }
 ...
2062     tw_dev->base_addr = ioremap(mem_addr, mem_len);
2063     if (!tw_dev->base_addr) {
2064         TW_PRINTK(tw_dev->host, TW_DRIVER, 0x35, "Failed to ioremap");
             // (2) The value of retval is 0. Insert "retval = -ENOMEM;" here?
2065         goto out_release_mem_region;
2066     }
2067 
2068     /* Disable interrupts on the card */
2069     TW_DISABLE_INTERRUPTS(tw_dev);
2070 
2071     /* Initialize the card */
2072     if (twa_reset_sequence(tw_dev, 0))
             // (3) The value of retval is 0. Insert "retval = -ENODEV;" here?
2073         goto out_iounmap;
 ...
2133     return 0;
2134 
2135 out_remove_host:
2136     if (test_bit(TW_USING_MSI, &tw_dev->flags))
2137         pci_disable_msi(pdev);
2138     scsi_remove_host(host);
2139 out_iounmap:
2140     iounmap(tw_dev->base_addr);
2141 out_release_mem_region:
2142     pci_release_regions(pdev);
2143 out_free_device_extension:
2144     twa_free_device_extension(tw_dev);
2145     scsi_host_put(host);
2146 out_disable_device:
2147     pci_disable_device(pdev);
2148 
2149     return retval;
2150 } /* End twa_probe() */
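
The three suggested assignments could look roughly like the sketch below. It is
a user-space illustration only, not a patch against 3w-9xxx.c: struct
probe_plan and probe_sketch() are hypothetical, the boolean flags stand in for
the helper calls at lines 2041, 2062 and 2072, and the cleanup labels are
collapsed into one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical switches deciding which probe step fails. */
struct probe_plan {
	bool init_ext_fails;	/* device extension setup */
	bool ioremap_fails;	/* register mapping */
	bool reset_fails;	/* controller reset */
};

int probe_sketch(const struct probe_plan *plan)
{
	int retval = 0;	/* value left behind by an earlier successful call */

	if (plan->init_ext_fails) {
		retval = -ENODEV;	/* position (1) */
		goto out;
	}
	if (plan->ioremap_fails) {
		retval = -ENOMEM;	/* position (2) */
		goto out;
	}
	if (plan->reset_fails) {
		retval = -ENODEV;	/* position (3) */
		goto out;
	}
	return 0;
out:
	/* The real function unwinds its resources here. */
	return retval;
}

void self_test(void)
{
	struct probe_plan ok = { false, false, false };
	struct probe_plan bad_map = { false, true, false };
	struct probe_plan bad_reset = { false, false, true };

	assert(probe_sketch(&ok) == 0);
	assert(probe_sketch(&bad_map) == -ENOMEM);
	assert(probe_sketch(&bad_reset) == -ENODEV);
}
```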

Thanks very much!


[Bug 188861] New: Function csio_config_device_caps() does not set error codes on failures

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188861

Bug ID: 188861
   Summary: Function csio_config_device_caps() does not set error
codes on failures
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

The function csio_config_device_caps(), defined in
drivers/scsi/csiostor/csio_hw.c, returns the value of the variable rv at the
end. From the source code of its caller, we can infer that rv should take a
non-zero value on failure. However, after the check of rv at line 1376, its
value must be 0. As a result, 0 is returned even if the subsequent call to
csio_mb_issue() (at line 1389) or csio_mb_fw_retval() (at line 1394) fails. I
guess that letting rv receive the return value of csio_hw_validate_caps() at
line 1375 may be a typo. Did the author mean "retval" instead of "rv"? The code
related to this bug is summarised below.

csio_config_device_caps @@ drivers/scsi/csiostor/csio_hw.c
1347 static int
1348 csio_config_device_caps(struct csio_hw *hw)
1349 {
1350     struct csio_mb  *mbp;
1351     enum fw_retval retval;
1352     int rv = -EINVAL;
1353 
1354     mbp = mempool_alloc(hw->mb_mempool, GFP_ATOMIC);
1355     if (!mbp) {
1356         CSIO_INC_STATS(hw, n_err_nomem);
1357         return -ENOMEM;
1358     }
 ...
1374     /* Validate device capabilities */
1375     rv = csio_hw_validate_caps(hw, mbp); // use "retval" instead of "rv"?
1376     if (rv != 0)
1377         goto out;
1378 
1379     /* Don't config device capabilities if already configured */
1380     if (hw->fw_state == CSIO_DEV_STATE_INIT) {
1381         rv = 0;
1382         goto out;
1383     }
1384 
1385     /* Write back desired device capabilities */
1386     csio_mb_caps_config(hw, mbp, CSIO_MB_DEFAULT_TMO, true, true,
1387                         false, true, NULL);
1388 
1389     if (csio_mb_issue(hw, mbp)) {
1390         csio_err(hw, "Issue of FW_CAPS_CONFIG_CMD(w) failed!\n");
             // on this error, the return value is 0
1391         goto out;
1392     }
1393 
1394     retval = csio_mb_fw_retval(mbp);
1395     if (retval != FW_SUCCESS) {
1396         csio_err(hw, "FW_CAPS_CONFIG_CMD(w) returned %d!\n", retval);
             // on this error, the return value is 0
1397         goto out;
1398     }
1399 
1400     rv = 0;
1401 out:
1402     mempool_free(mbp, hw->mb_mempool);
1403     return rv;
1404 }
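
One way to keep the pessimistic default in rv alive is to store intermediate
results in a separate local, which is close in spirit to the "retval instead
of rv" guess above. The sketch below is a user-space illustration under that
assumption, not csiostor code: struct caps_plan, config_caps_sketch() and the
local err are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical inputs that decide which step of the sequence fails. */
struct caps_plan {
	int validate_result;	/* 0 or a negative errno from validation */
	bool issue_fails;	/* the mailbox command could not be issued */
	bool fw_rejects;	/* the firmware answered with a failure status */
};

int config_caps_sketch(const struct caps_plan *plan)
{
	int rv = -EINVAL;	/* pessimistic default, kept until success */
	int err;

	err = plan->validate_result;	/* intermediate result, not rv */
	if (err != 0) {
		rv = err;
		goto out;
	}

	if (plan->issue_fails)
		goto out;	/* rv is still -EINVAL */

	if (plan->fw_rejects)
		goto out;	/* rv is still -EINVAL */

	rv = 0;
out:
	return rv;
}

void self_test(void)
{
	struct caps_plan ok = { 0, false, false };
	struct caps_plan bad_validate = { -EIO, false, false };
	struct caps_plan bad_issue = { 0, true, false };

	assert(config_caps_sketch(&ok) == 0);
	assert(config_caps_sketch(&bad_validate) == -EIO);
	assert(config_caps_sketch(&bad_issue) == -EINVAL);
}
```

Because rv is only overwritten with a real error or with the final 0, every
early "goto out" returns a non-zero value without extra assignments.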

Thanks very much!


[Bug 188941] New: Function beiscsi_create_cqs() may return improper value when the call to pci_alloc_consistent() fails, which may result in use-after-free

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188941

Bug ID: 188941
   Summary: Function beiscsi_create_cqs() may return improper
value when the call to pci_alloc_consistent() fails,
which may result in use-after-free
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

Function pci_alloc_consistent() returns a NULL pointer if there is not enough
memory. In function beiscsi_create_cqs(), defined in
drivers/scsi/be2iscsi/be_main.c, pci_alloc_consistent() is called and its
return value is checked against NULL (at line 3116). If the return value is
NULL, control jumps to the label "create_cq_error", which frees the allocated
memory and returns the variable ret. Because ret must be 0 after the first
iteration of the loop (see the check of ret at line 3129), the return value
will be 0 (indicating success) if pci_alloc_consistent() fails during the
second or any later iteration. In this case, the freed memory may be used or
freed again in the callers of beiscsi_create_cqs(). I think it is better to
assign "-ENOMEM" to ret when the call to pci_alloc_consistent() fails. The code
and comments related to this bug are summarised below.

beiscsi_create_cqs @@ drivers/scsi/be2iscsi/be_main.c
3092 static int beiscsi_create_cqs(struct beiscsi_hba *phba,
3093                               struct hwi_context_memory *phwi_context)
3094 {
 ...
3100     int ret = -ENOMEM;
 ...
3106     for (i = 0; i < phba->num_cpus; i++) {
3107         cq = &phwi_context->be_cq[i];
3108         eq = &phwi_context->be_eq[i].q;
3109         pbe_eq = &phwi_context->be_eq[i];
3110         pbe_eq->cq = cq;
3111         pbe_eq->phba = phba;
3112         mem = &cq->dma_mem;
3113         cq_vaddress = pci_alloc_consistent(phba->pcidev,
3114                                            num_cq_pages * PAGE_SIZE,
3115                                            &paddr);
3116         if (!cq_vaddress)
                 // ret may take the value 0. Add "ret = -ENOMEM" here?
3117             goto create_cq_error;
 ...
3129         ret = beiscsi_cmd_cq_create(&phba->ctrl, cq, eq, false,
3130                                     false, 0);
3131         if (ret) {
3132             beiscsi_log(phba, KERN_ERR, BEISCSI_LOG_INIT,
3133                         "BM_%d : beiscsi_cmd_eq_create"
3134                         "Failed for ISCSI CQ\n");
3135             goto create_cq_error;
3136         }
3137         beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_INIT,
3138                     "BM_%d : iscsi cq_id is %d for eq_id %d\n"
3139                     "iSCSI CQ CREATED\n", cq->id, eq->id);
3140     }
3141     return 0;
3142 
3143 create_cq_error:
3144     for (i = 0; i < phba->num_cpus; i++) {
3145         cq = &phwi_context->be_cq[i];
3146         mem = &cq->dma_mem;
3147         if (mem->va)
3148             pci_free_consistent(phba->pcidev, num_cq_pages
3149                                 * PAGE_SIZE,
3150                                 mem->va, mem->dma);
3151     }
3152     return ret;
3153 }
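
The stale-value problem is specific to the loop: ret starts as -ENOMEM but is
overwritten with 0 by the first successful create command. A small user-space
sketch of the loop, with the suggested assignment in place, is given below. It
is an illustration only: alloc_queue(), create_queue_cmd(), create_queues() and
NQUEUES are hypothetical, and malloc()/free() stand in for the DMA allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define NQUEUES 4

/* Hypothetical allocator that fails for the index given in fail_at. */
static void *alloc_queue(int i, int fail_at)
{
	return i == fail_at ? NULL : malloc(16);
}

/* Hypothetical per-queue setup command; always succeeds in this sketch. */
static int create_queue_cmd(void *mem)
{
	(void)mem;
	return 0;
}

int create_queues(void *queues[NQUEUES], int fail_at)
{
	int ret = -ENOMEM;
	int i;

	for (i = 0; i < NQUEUES; i++)
		queues[i] = NULL;

	for (i = 0; i < NQUEUES; i++) {
		queues[i] = alloc_queue(i, fail_at);
		if (!queues[i]) {
			/* Without this, ret holds 0 from the previous pass. */
			ret = -ENOMEM;
			goto error;
		}
		ret = create_queue_cmd(queues[i]);
		if (ret)
			goto error;
	}
	return 0;

error:
	for (i = 0; i < NQUEUES; i++) {
		free(queues[i]);	/* free(NULL) is a no-op */
		queues[i] = NULL;
	}
	return ret;
}

void self_test(void)
{
	void *q[NQUEUES];
	int i;

	/* Failure in the third pass must be reported, not hidden as 0. */
	assert(create_queues(q, 2) == -ENOMEM);
	assert(q[0] == NULL);

	assert(create_queues(q, -1) == 0);
	for (i = 0; i < NQUEUES; i++)
		free(q[i]);
}
```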

Thanks very much!


[Bug 189001] New: Function twl_probe() does not set error codes on some failures

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=189001

Bug ID: 189001
   Summary: Function twl_probe() does not set error codes on some
failures
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

In function twl_probe(), defined in drivers/scsi/3w-sas.c, the variable retval
is checked at line 1608, so its value must be 0 when pci_iomap() is called (at
line 1614). If pci_iomap() returns a NULL pointer, control jumps to the label
"out_release_mem_region", cleans up, and returns the value of retval (i.e. 0).
As a result, twl_probe() returns 0 (indicating success) even though an error
occurred, which may mislead its caller.
There are two other similar bugs when the function calls at lines 1601 and 1624
fail. Though these errors may occur rarely, I think it is better to set the
correct error codes on failure. The code is summarised below.

twl_probe @@ drivers/scsi/3w-sas.c
1564 static int twl_probe(struct pci_dev *pdev, const struct pci_device_id *dev_id)
1565 {
1566     struct Scsi_Host *host = NULL;
1567     TW_Device_Extension *tw_dev;
1568     int retval = -ENODEV;
1569     int *ptr_phycount, phycount=0;
1570 
1571     retval = pci_enable_device(pdev);
1572     if (retval) {
1573         TW_PRINTK(host, TW_DRIVER, 0x17, "Failed to enable pci device");
1574         goto out_disable_device;
1575     }
 ...
1601     if (twl_initialize_device_extension(tw_dev)) {
1602         TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1a, "Failed to initialize device extension");
             // Bug (1): retval takes value 0. Insert "retval = -ENODEV;"?
1603         goto out_free_device_extension;
1604     }
1605 
1606     /* Request IO regions */
1607     retval = pci_request_regions(pdev, "3w-sas");
1608     if (retval) {
1609         TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1b, "Failed to get mem region");
1610         goto out_free_device_extension;
1611     }
1612 
1613     /* Save base address, use region 1 */
1614     tw_dev->base_addr = pci_iomap(pdev, 1, 0);
1615     if (!tw_dev->base_addr) {
1616         TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1c, "Failed to ioremap");
             // Bug (2): retval takes value 0. Insert "retval = -ENOMEM;"?
1617         goto out_release_mem_region;
1618     }
1619 
1620     /* Disable interrupts on the card */
1621     TWL_MASK_INTERRUPTS(tw_dev);
1622 
1623     /* Initialize the card */
1624     if (twl_reset_sequence(tw_dev, 0)) {
1625         TW_PRINTK(tw_dev->host, TW_DRIVER, 0x1d, "Controller reset failed during probe");
             // Bug (3): retval takes value 0. Insert "retval = -ENODEV;"?
1626         goto out_iounmap;
1627     }

Thanks very much!


[Bug 188961] New: Function mvs_task_prep() returns improper values on failures

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188961

Bug ID: 188961
   Summary: Function mvs_task_prep() returns improper values on
failures
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

The function mvs_task_prep(), defined in drivers/scsi/mvsas/mv_sas.c, returns 0
on success and a non-zero value on failure. It calls pci_pool_alloc() and
checks its return value against NULL (at line 794). If the return value is
NULL, control jumps to the label "err_out_tag", cleans up the allocated memory
and returns the variable rc. Function pci_pool_alloc() is called after the
check of rc, so the value of rc must be 0. As a result, mvs_task_prep() returns
0 (indicating success) even if the call to pci_pool_alloc() fails. I think it
is better to assign "-ENOMEM" to rc when pci_pool_alloc() fails. The code and
comments related to this bug are summarised below.

mvs_task_prep @@ drivers/scsi/mvsas/mv_sas.c
 711 static int mvs_task_prep(struct sas_task *task, struct mvs_info *mvi, int is_tmf,
 712                          struct mvs_tmf_task *tmf, int *pass)
 713 {
 ...
 719     int rc = 0;
 ...
 783     rc = mvs_tag_alloc(mvi, &tag);
 784     if (rc)
 785         goto err_out;
 786 
 787     slot = &mvi->slot_info[tag];
 788 
 789     task->lldd_task = NULL;
 790     slot->n_elem = n_elem;
 791     slot->slot_tag = tag;
 792 
 793     slot->buf = pci_pool_alloc(mvi->dma_pool, GFP_ATOMIC, &slot->buf_dma);
 794     if (!slot->buf)
             // insert "rc = -ENOMEM" here?
 795         goto err_out_tag;
 ...
 838     return rc;
 839 
 840 err_out_slot_buf:
 841     pci_pool_free(mvi->dma_pool, slot->buf, slot->buf_dma);
 842 err_out_tag:
 843     mvs_tag_free(mvi, tag);
 844 err_out:
 845 
 846     dev_printk(KERN_ERR, mvi->dev, "mvsas prep failed[%d]!\n", rc);
 847     if (!sas_protocol_ata(task->task_proto))
 848         if (n_elem)
 849             dma_unmap_sg(mvi->dev, task->scatter, n_elem,
 850                          task->data_dir);
 851 prep_out:
 852     return rc;
 853 }
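
Why the return value matters becomes clearer when a caller is added. The
sketch below is a user-space illustration, not mvsas code: struct slot,
task_prep_sketch() and submit_sketch() are hypothetical, the tag value is
arbitrary, and malloc() stands in for the DMA pool.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct slot {
	int tag;	/* -1 when no tag is held */
	char *buf;	/* NULL when no buffer is held */
};

int task_prep_sketch(struct slot *slot, bool buf_alloc_fails)
{
	int rc = 0;	/* 0 after the (assumed successful) tag allocation */

	slot->tag = 7;
	slot->buf = buf_alloc_fails ? NULL : malloc(32);
	if (!slot->buf) {
		rc = -ENOMEM;	/* the assignment suggested in the report */
		goto err_out_tag;
	}
	return rc;

err_out_tag:
	slot->tag = -1;	/* give the tag back */
	return rc;
}

/* Hypothetical caller that relies on the return value alone. */
int submit_sketch(struct slot *slot, bool buf_alloc_fails)
{
	int rc = task_prep_sketch(slot, buf_alloc_fails);

	if (rc)
		return rc;	/* would be skipped if prep returned 0 on failure */

	memset(slot->buf, 0, 32);	/* safe only because buf is non-NULL here */
	free(slot->buf);
	slot->buf = NULL;
	return 0;
}

void self_test(void)
{
	struct slot s = { -1, NULL };

	assert(submit_sketch(&s, true) == -ENOMEM);
	assert(s.tag == -1 && s.buf == NULL);
	assert(submit_sketch(&s, false) == 0);
}
```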

Thanks very much!


[Bug 188951] New: Function beiscsi_create_eqs() may return improper value when the call to pci_alloc_consistent() fails, which may result in use-after-free

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=188951

Bug ID: 188951
   Summary: Function beiscsi_create_eqs() may return improper
value when the call to pci_alloc_consistent() fails,
which may result in use-after-free
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

Function pci_alloc_consistent() returns a NULL pointer if there is not enough
memory. In function beiscsi_create_eqs(), defined in
drivers/scsi/be2iscsi/be_main.c, pci_alloc_consistent() is called and its
return value is checked against NULL (at line 3052). If the return value is
NULL, control jumps to the label "create_eq_error", which frees the allocated
memory and returns the variable ret. Because ret must be 0 after the first
iteration of the loop (see the check of ret at line 3067), the return value
will be 0 (indicating success) if pci_alloc_consistent() fails during the
second or any later iteration. In this case, the freed memory may be used or
freed again in the callers of beiscsi_create_eqs(). I think it is better to
assign "-ENOMEM" to ret when the call to pci_alloc_consistent() fails. The code
and comments related to this bug are summarised below.

beiscsi_create_eqs @@ drivers/scsi/be2iscsi/be_main.c
3028 static int beiscsi_create_eqs(struct beiscsi_hba *phba,
3029                               struct hwi_context_memory *phwi_context)
3030 {
3031     int ret = -ENOMEM, eq_for_mcc;
 ...
3045     for (i = 0; i < (phba->num_cpus + eq_for_mcc); i++) {
3046         eq = &phwi_context->be_eq[i].q;
3047         mem = &eq->dma_mem;
3048         phwi_context->be_eq[i].phba = phba;
3049         eq_vaddress = pci_alloc_consistent(phba->pcidev,
3050                                            num_eq_pages * PAGE_SIZE,
3051                                            &paddr);
3052         if (!eq_vaddress)
                 // ret may take the value 0. Add "ret = -ENOMEM" here?
3053             goto create_eq_error;
 ...
3065         ret = beiscsi_cmd_eq_create(&phba->ctrl, eq,
3066                                     phwi_context->cur_eqd);
3067         if (ret) {
3068             beiscsi_log(phba, KERN_ERR, BEISCSI_LOG_INIT,
3069                         "BM_%d : beiscsi_cmd_eq_create"
3070                         "Failed for EQ\n");
3071             goto create_eq_error;
3072         }
3073 
3074         beiscsi_log(phba, KERN_INFO, BEISCSI_LOG_INIT,
3075                     "BM_%d : eqid = %d\n",
3076                     phwi_context->be_eq[i].q.id);
3077     }
3078     return 0;
3079 
3080 create_eq_error:
3081     for (i = 0; i < (phba->num_cpus + eq_for_mcc); i++) {
3082         eq = &phwi_context->be_eq[i].q;
3083         mem = &eq->dma_mem;
3084         if (mem->va)
3085             pci_free_consistent(phba->pcidev, num_eq_pages
3086                                 * PAGE_SIZE,
3087                                 mem->va, mem->dma);
3088     }
3089     return ret;
3090 }

Thanks very much!


[Bug 189061] New: Function snic_probe() does not set set code when the call to mempool_create_slab_pool() fails

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=189061

Bug ID: 189061
   Summary: Function snic_probe() does not set set code when the
call to mempool_create_slab_pool() fails
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: linux-4.9-rc6
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: bianpan2...@ruc.edu.cn
Regression: No

In the function snic_probe(), defined in drivers/scsi/snic/snic_main.c, when
the call to mempool_create_slab_pool() (at line 589) returns a NULL pointer,
control jumps to the label "err_free_res" and the function returns the
variable ret. Because ret is checked at line 562, its value must be 0 here. As
a result, snic_probe() returns 0 (indicating success) even if the call to
mempool_create_slab_pool() fails.
There are two other similar bugs when the calls to mempool_create_slab_pool()
at lines 599 and 609 fail. Though these errors may occur rarely, I think it may
be better to set correct error codes (e.g. -ENOMEM) on failure. The code
related to these bugs is summarised below.

snic_probe @@ drivers/scsi/snic/snic_main.c
 360 static int
 361 snic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 362 {
 ...
 368     int ret, i;
 ...
 561     ret = snic_alloc_vnic_res(snic);
 562     if (ret) {
 563         SNIC_HOST_ERR(shost,
 564                       "Failed to alloc vNIC resources aborting. %d\n",
 565                       ret);
 566 
 567         goto err_clear_intr;
 568     }
 ...
         // Insert "ret = -ENOMEM;" ?
 589     pool = mempool_create_slab_pool(2,
 590                 snic_glob->req_cache[SNIC_REQ_CACHE_DFLT_SGL]);
 591     if (!pool) {
 592         SNIC_HOST_ERR(shost, "dflt sgl pool creation failed\n");
 593 
             // Bug (1): the value of ret is 0
 594         goto err_free_res;
 595     }
 596 
 597     snic->req_pool[SNIC_REQ_CACHE_DFLT_SGL] = pool;
 598 
 599     pool = mempool_create_slab_pool(2,
 600                 snic_glob->req_cache[SNIC_REQ_CACHE_MAX_SGL]);
 601     if (!pool) {
 602         SNIC_HOST_ERR(shost, "max sgl pool creation failed\n");
 603 
             // Bug (2): the value of ret is 0
 604         goto err_free_dflt_sgl_pool;
 605     }
 606 
 607     snic->req_pool[SNIC_REQ_CACHE_MAX_SGL] = pool;
 608 
 609     pool = mempool_create_slab_pool(2,
 610                 snic_glob->req_cache[SNIC_REQ_TM_CACHE]);
 611     if (!pool) {
 612         SNIC_HOST_ERR(shost, "snic tmreq info pool creation failed.\n");
 613 
             // Bug (3): the value of ret is 0
 614         goto err_free_max_sgl_pool;
 615     }
 ...
 701     return 0;
 ...
 733 err_free_max_sgl_pool:
 734     mempool_destroy(snic->req_pool[SNIC_REQ_CACHE_MAX_SGL]);
 735 
 736 err_free_dflt_sgl_pool:
 737     mempool_destroy(snic->req_pool[SNIC_REQ_CACHE_DFLT_SGL]);
 738 
 739 err_free_res:
 740     snic_free_vnic_res(snic);
 741 
 742 err_clear_intr:
 743     snic_clear_intr_mode(snic);
 744 
 ...
 772     return ret;
 773 } /* end of snic_probe */
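
The single assignment proposed before line 589 covers all three pool creations
at once, as the user-space sketch below tries to show. It is an illustration
only, not snic code: make_pool() and probe_pools_sketch() are hypothetical, and
malloc()/free() stand in for mempool_create_slab_pool()/mempool_destroy().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical pool constructor; returns NULL for the index in fail_at. */
static void *make_pool(int index, int fail_at)
{
	return index == fail_at ? NULL : malloc(8);
}

int probe_pools_sketch(int fail_at)
{
	void *dflt_pool, *max_pool, *tm_pool;
	int ret = 0;	/* value left behind by an earlier successful call */

	ret = -ENOMEM;	/* set once, before the group of allocations */

	dflt_pool = make_pool(0, fail_at);
	if (!dflt_pool)
		goto err_free_res;

	max_pool = make_pool(1, fail_at);
	if (!max_pool)
		goto err_free_dflt_pool;

	tm_pool = make_pool(2, fail_at);
	if (!tm_pool)
		goto err_free_max_pool;

	/* A real probe would keep the pools; the sketch releases them. */
	free(tm_pool);
	free(max_pool);
	free(dflt_pool);
	return 0;

err_free_max_pool:
	free(max_pool);
err_free_dflt_pool:
	free(dflt_pool);
err_free_res:
	return ret;
}

void self_test(void)
{
	assert(probe_pools_sketch(0) == -ENOMEM);
	assert(probe_pools_sketch(1) == -ENOMEM);
	assert(probe_pools_sketch(2) == -ENOMEM);
	assert(probe_pools_sketch(-1) == 0);
}
```

The trade-off of this style is that any later statement that stores a
successful result in ret silently cancels the preset value, so the assignment
has to stay directly in front of the calls it is meant to cover.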

Thanks very much!


[Bug 189051] New: Function fnic_probe() does not set set code when the call to mempool_create_slab_pool() fails

2016-11-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=189051

            Bug ID: 189051
           Summary: Function fnic_probe() does not set set code when the
                    call to mempool_create_slab_pool() fails
           Product: SCSI Drivers
           Version: 2.5
    Kernel Version: linux-4.9-rc6
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
          Reporter: bianpan2...@ruc.edu.cn
        Regression: No

In the function fnic_probe() defined in file drivers/scsi/fnic/fnic_main.c,
when the call to mempool_create_slab_pool() (at line 738) returns a NULL
pointer, the control flow jumps to label "err_out_free_resources" and returns
variable err. Because err was checked at line 714 and execution continued, its
value must be 0 here. As a result, function fnic_probe() returns 0 (indicating
success) even if the call to mempool_create_slab_pool() fails.
There are 2 other similar bugs when the calls to mempool_create_slab_pool() at
lines 742 and 747 fail. Though these errors may occur rarely, I think it would
be better to set correct error codes (e.g. -ENOMEM) on failure. The code related
to these bugs is summarised as follows.

fnic_probe @@ drivers/scsi/fnic/fnic_main.c
 541 static int fnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 542 {
 ...
 547 int err;
 ...
 713 err = fnic_alloc_vnic_resources(fnic);
 714 if (err) {
 715 shost_printk(KERN_ERR, fnic->lport->host,
 716  "Failed to alloc vNIC resources, "
 717  "aborting.\n");
 718 goto err_out_clear_intr;
 719 }
 ...
 // Insert "err = -ENOMEM;" ?
 738 fnic->io_req_pool = mempool_create_slab_pool(2, fnic_io_req_cache);
 739 if (!fnic->io_req_pool)
 // Bug (1): the value of err is 0
 740 goto err_out_free_resources;
 741 
 742 pool = mempool_create_slab_pool(2, fnic_sgl_cache[FNIC_SGL_CACHE_DFLT]);
 743 if (!pool)
 // Bug (2): the value of err is 0
 744 goto err_out_free_ioreq_pool;
 745 fnic->io_sgl_pool[FNIC_SGL_CACHE_DFLT] = pool;
 746 
 747 pool = mempool_create_slab_pool(2, fnic_sgl_cache[FNIC_SGL_CACHE_MAX]);
 748 if (!pool)
 // Bug (3): the value of err is 0
 749 goto err_out_free_dflt_pool;
 ...
 901 return 0;
 ...
 914 err_out_free_dflt_pool:
 915 mempool_destroy(fnic->io_sgl_pool[FNIC_SGL_CACHE_DFLT]);
 916 err_out_free_ioreq_pool:
 917 mempool_destroy(fnic->io_req_pool);
 918 err_out_free_resources:
 919 fnic_free_vnic_resources(fnic);
 920 err_out_clear_intr:
 921 fnic_clear_intr_mode(fnic);
 922 err_out_dev_close:
 923 vnic_dev_close(fnic->vdev);
 924 err_out_vnic_unregister:
 925 vnic_dev_unregister(fnic->vdev);
 926 err_out_iounmap:
 927 fnic_iounmap(fnic);
 928 err_out_release_regions:
 929 pci_release_regions(pdev);
 930 err_out_disable_device:
 931 pci_disable_device(pdev);
 932 err_out_free_hba:
 933 fnic_stats_debugfs_remove(fnic);
 934 scsi_host_put(lp->host);
 935 err_out:
 936 return err;
 937 }
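
A minimal userspace sketch of the pattern described above, not the actual
driver patch. It assumes a hypothetical pool_create() as a stand-in for
mempool_create_slab_pool(), and the names probe_buggy() and probe_fixed() are
invented for illustration. It only shows why the error path returns 0 and how
setting err before the jump would change that.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for mempool_create_slab_pool():
 * returns NULL when should_fail is non-zero. */
void *pool_create(int should_fail)
{
	return should_fail ? NULL : malloc(16);
}

/* Buggy shape: err still holds 0 from the earlier successful step. */
int probe_buggy(int pool_fails)
{
	int err = 0;		/* an earlier step succeeded and left err == 0 */
	void *pool;

	pool = pool_create(pool_fails);
	if (!pool)
		goto err_out;	/* err is 0 here, so failure looks like success */

	free(pool);
	return 0;

err_out:
	return err;
}

/* Fixed shape: assign an error code before jumping to the cleanup label. */
int probe_fixed(int pool_fails)
{
	int err = 0;
	void *pool;

	pool = pool_create(pool_fails);
	if (!pool) {
		err = -ENOMEM;	/* report the allocation failure to the caller */
		goto err_out;
	}

	free(pool);
	return 0;

err_out:
	return err;
}
```

With a failing pool, probe_buggy(1) returns 0 while probe_fixed(1) returns
-ENOMEM; both return 0 when the pool is created. The same change would apply
to each of the three goto sites marked in the listing.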

Thanks very much!

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-12-02 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

mus@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mus@gmail.com

--- Comment #6 from mus@gmail.com ---
I can confirm this issue on my Acer Swift 3 Model SF314-51-59RF. Arch Linux x64
with kernels 4.8.11 and 4.9-rc7 hangs with a black screen on boot.

However, I can also confirm some reports from this Acer thread that this seems
to be an x64-specific issue:

https://community.acer.com/t5/Swift-Spin-S-and-R-Series/Ubuntu-on-Swift-3-SF314-51-74FW-black-screen-after-menu-on-Live/td-p/464481/highlight/true/page/3

I'm typing this from a perfectly fine working Fedora 25 32-Bit Live system with
kernel 4.8.6. The x64 version hangs on boot just like Arch Linux.

btw, there is no BIOS update available for the Swift 3 yet (BIOS version 1.05).

[Bug 151631] "Synchronizing SCSI cache" fails during(and delays) reboot/shutdown

2016-12-04 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=151631

Rich  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |f...@bitservices.org.uk

--- Comment #7 from Rich  ---
A very similar (if not the same) problem is happening on my work PC - a Dell
Precision T1700. I will try to get a screenshot and add it to this thread.

The problem began with kernel 4.8.6 and is still present (currently running
kernel 4.8.11).

Very annoying on shutdown. I have absolutely no idea whether the disk's cache is
actually flushing or not - so data integrity is a concern. It's a magnetic disk
and I have been running fsck on it regularly (no issues found as of yet) and
have been avoiding shutting the PC down.

Point of this comment: it's still an issue on some PCs. Is anyone else finding
the same?

[Bug 151631] "Synchronizing SCSI cache" fails during(and delays) reboot/shutdown

2016-12-04 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=151631

--- Comment #8 from Daniele Viganò  ---
@Rich, have a look at https://bugzilla.kernel.org/show_bug.cgi?id=187061

Bug has been resolved in 4.9-rc7.

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-12-14 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

--- Comment #7 from mus@gmail.com ---
Update:

It seems like this was partly fixed by BIOS upgrade 1.07 (only available via
Windows Update).

Fedora x64 and Ubuntu x64 are confirmed to boot after the BIOS upgrade (see the
mentioned acer community thread).

However, Arch Linux x64 still doesn't boot and I haven't figured out the
difference yet. Arch has kernel 4.8.11 while Fedora boots fine with 4.8.6 and
4.8.13, so I doubt it's the kernel version.

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-12-21 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

Martin Goyot  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mar...@piwany.com

--- Comment #8 from Martin Goyot  ---
I confirm, I have this exact same problem with TravelMate X349-M.

Fedora 25 32-Bit Live system working, but impossible to boot the x64 version,
black frozen screen after boot menu.

I cannot confirm the BIOS update fix because I no longer have Windows, so I
don't know how I will install it...

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-12-21 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

--- Comment #9 from Martin Goyot  ---
Can now confirm. Updated the BIOS and can now run the x64 version.

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-12-22 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

--- Comment #10 from mus@gmail.com ---
Created attachment 248381
  --> https://bugzilla.kernel.org/attachment.cgi?id=248381&action=edit
Kernel Panic Arch Linux x64 kernel 4.9

Attached is a kernel panic I get with Arch Linux x64 and kernel 4.9.
Unfortunately I can't get the whole stack trace. I tried to up the resolution
with the vga= parameter but then the screen stays black. I actually got it to
boot once, but had no other success after 10 other tries.

The stack trace looks very similar to Bug 58201 (comment 20 on that bug also
mentions the same problem with vga=).

Since Fedora and Ubuntu are fixed by the BIOS upgrade but Arch Linux still
doesn't boot, I guess this may be related to different kernel configs? Any
hints on which configs could affect this?

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-12-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

--- Comment #11 from Zhang Rui  ---
please check if the latest upstream kernel works for you or not.
If yes, I will close this bug as the original bug has been fixed by BIOS
upgrade, and the arch linux 4.9 kernel issue sounds like a Distro problem to
me.

[Bug 176951] boot fails unless acpi=off Acer Travelmate X-349

2016-12-25 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=176951

--- Comment #12 from Zhang Rui  ---
(In reply to Zhang Rui from comment #11)
> please check if the latest upstream kernel works for you or not.
> If yes, I will close this bug as the original bug has been fixed by BIOS
> upgrade, and the arch linux 4.9 kernel issue sounds like a Distro problem to
> me.
If not, please also check the upstream 4.8 kernel to see whether this is an
upstream kernel regression, and use git bisect to find the offending commit.

[Bug 191171] New: megaraid_sas fails to recognize RAID on startup, worked with version 4.1.20

2016-12-26 Thread bugzilla-daemon

https://bugzilla.kernel.org/show_bug.cgi?id=191171

            Bug ID: 191171
           Summary: megaraid_sas fails to recognize RAID on startup,
                    worked with version 4.1.20
           Product: SCSI Drivers
           Version: 2.5
    Kernel Version: 4.1.36
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
          Reporter: blan...@worldcom.ch
        Regression: No

Created attachment 248561
  --> https://bugzilla.kernel.org/attachment.cgi?id=248561&action=edit
Boot with kernel 4.1.36 which fails to see RAID

After a kernel update made by the openSuSE Leap 42.1 online update and a reboot,
the kernel no longer found the RAID.
Rebooting with the previous kernel (4.1.20) fixed the problem.
See the attached journalctl logs (failing & working).

Hardware environment:
- Intel Xeon E5-2650 v2
- Intel C602 Patsburg-A chipset
- 128 GB RAM
- LSI MegaRAID 9261-8i SAS-2 RAID Controller, PCIe 2.0 (512Mb RAM)

OS:
- openSuSE Leap 42.1
