[Resend with Ack][PATCH 8/9] scsi/pm8001: use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)

2013-06-27 Thread Yijing Wang
Pci core has been saved pm cap register offset by pdev->pm_cap in pci_pm_init()
in init path. So we can use pdev->pm_cap instead of using
pci_find_capability(pdev, PCI_CAP_ID_PM) for better performance and simplified 
code.

Acked-by: Lindar Liu 
Tested-by: Lindar Liu 
Signed-off-by: Yijing Wang 
Cc: xjtu...@gmail.com
Cc: lindar_...@usish.com
Cc: "James E.J. Bottomley" 
Cc: linux-scsi@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
---
 drivers/scsi/pm8001/pm8001_init.c |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/pm8001/pm8001_init.c 
b/drivers/scsi/pm8001/pm8001_init.c
index e4b9bc7..3861aa1 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -912,14 +912,13 @@ static int pm8001_pci_suspend(struct pci_dev *pdev, 
pm_message_t state)
 {
struct sas_ha_struct *sha = pci_get_drvdata(pdev);
struct pm8001_hba_info *pm8001_ha;
-   int i , pos;
+   int i;
u32 device_state;
pm8001_ha = sha->lldd_ha;
flush_workqueue(pm8001_wq);
scsi_block_requests(pm8001_ha->shost);
-   pos = pci_find_capability(pdev, PCI_CAP_ID_PM);
-   if (pos == 0) {
-   printk(KERN_ERR " PCI PM not supported\n");
+   if (!pdev->pm_cap) {
+   dev_err(&pdev->dev, " PCI PM not supported\n");
return -ENODEV;
}
PM8001_CHIP_DISP->interrupt_disable(pm8001_ha, 0xFF);
-- 
1.7.1


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_prep_fn() check for empty queue

2013-06-27 Thread Maxim Uvarov

On 06/26/2013 01:20 PM, Bart Van Assche wrote:

On 06/26/13 11:02, Maxim Uvarov wrote:

This fix:
end_request: I/O error, dev sdc, sector 976576
  rport-0:0-3: blocked FC remote port time out: removing target and 
saving

binding
BUG: unable to handle kernel NULL pointer dereference at 
0400

IP: [] scsi_prep_state_check+0xe/0x99
  [] scsi_setup_blk_pc_cmnd+0x1b/0x115
  [] scsi_prep_fn+0x29/0x3b
  [] blk_peek_request+0xe1/0x1b3
  [] scsi_request_fn+0x3a/0x4d2
  [] __generic_unplug_device+0x32/0x36
  [] blk_execute_rq_nowait+0x77/0x9e
  [] blk_execute_rq+0xa6/0xde
  [] ? printk+0x41/0x46
  [] ? get_rdac_req+0x81/0xe8 [scsi_dh_rdac]
  [] send_mode_select+0x29f/0x489 [scsi_dh_rdac]
  [] ? probe_workqueue_execution+0xb1/0xce
  [] worker_thread+0x1a9/0x237
  [] ? send_mode_select+0x0/0x489 [scsi_dh_rdac]
  [] ? autoremove_wake_function+0x0/0x39
  [] ? worker_thread+0x0/0x237
  [] kthread+0x7f/0x87
  [] child_rip+0xa/0x20
  [] ? kthread+0x0/0x87
  [] ? child_rip+0x0/0x20
Signed-off-by: Maxim Uvarov 
---
  drivers/scsi/scsi_lib.c |3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..8e89ed9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1295,6 +1295,9 @@ int scsi_prep_fn(struct request_queue *q, 
struct request *req)

  struct scsi_device *sdev = q->queuedata;
  int ret = BLKPREP_KILL;

+if (!sdev)
+return ret;
+
  if (req->cmd_type == REQ_TYPE_BLOCK_PC)
  ret = scsi_setup_blk_pc_cmnd(sdev, req);
  return scsi_prep_return(q, req, ret);


Sorry but this patch does not look like a proper fix to me. What you 
probably need is a scsi_device_get() call in scsi_dh_rdac.c somewhere 
before the queue_work(kmpath_rdacd, &ctlr->ms_work) call and a 
scsi_device_put() call once send_mode_select() has finished using the 
sdev.


Bart.

Thanks Bart. It looks reasonable. I will do some testing for your solution.

Maxim.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/7] Limit overall SCSI EH runtime

2013-06-27 Thread Ren Mingxin

Hi, Hannes & James:

On 06/10/2013 07:11 PM, Hannes Reinecke wrote:

This patchset implements a new 'eh_deadline' attribute to the
SCSI host. It will limit the overall SCSI EH runtime by a given
timeout. If the timeout expires all intermediate steps will
be skipped and host reset will be scheduled immediately.


First of all, I think this patchset is useful to restrict some actual
interminable EH processes. BTW: There were some patches which tried
to add user interface to customize different hardware reset levels to
shorten the EH duration, and unfortunately they were not accepted.
Your hard work on EH improvement is much appreciated:-)

But please let me know yourserialEH improvement jobs - will the
redundant environment(such as multipath, mirroring) be taken into
account specially? To this patchset, will just giving a appropriate
timeout to skip EH except for host reset is enough for a quick fail
over in redundant systems? In other words, do you think reserving the
host reset which will occupy the longest time in the escalated levels
is acceptable with redundant configuration?

Thanks,
Ren
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/4] scsi: ufs: Fix broken task management command implementation

2013-06-27 Thread Santosh Y
On Thu, Jun 13, 2013 at 8:00 PM, Sujit Reddy Thumma
 wrote:
> Currently, sending Task Management (TM) command to the card might
> be broken in some scenarios as listed below:
>
> - If there are more than 8 TM commands the implementation returns
> error to the caller.
> Fix: Wait for one of the slots to be emptied and send the command.
>
> - Sometimes it is necessary for the caller to know the TM service
> response code to determine the task status.
> Fix: Propogate the service response to the caller.
>
> - If the TM command times out no proper error recovery is implemented.
> Fix: Clear the command in the controller door-bell register, so that
> further commands for the same slot don't fail.
>
> - While preparing the TM command descriptor the tag used should be the
> free slot of TM command and not the task tag of the command which
> the TM command is trying to manage.
> Fix: Use free slot tag instead of task tag of SCSI command
>
> - Since the TM command involves H/W communication, abruptly ending the
> request on kill interrupt signal might cause h/w malfunction.
> Fix: Wait for hardware completion interrupt with TASK_UNINTERRUPTIBLE set.
>
> Signed-off-by: Sujit Reddy Thumma 
> ---
>  drivers/scsi/ufs/ufshcd.c |  174 
> +
>  drivers/scsi/ufs/ufshcd.h |6 +-
>  2 files changed, 117 insertions(+), 63 deletions(-)
>
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index aa1f11e..7d043f4 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -53,6 +53,9 @@
>  /* Query request timeout */
>  #define QUERY_REQ_TIMEOUT 30 /* msec */
>
> +/* Task management command timeout */
> +#define TM_CMD_TIMEOUT 100 /* msecs */
> +
>  #define SCSI_CMD_QUEUE_SIZE (hba->nutrs - 1)
>  /* Reserved tag for internal commands */
>  #define INTERNAL_CMD_TAG   (hba->nutrs - 1)
> @@ -188,13 +191,32 @@ ufshcd_get_tmr_ocs(struct utp_task_req_desc 
> *task_req_descp)
>  /**
>   * ufshcd_get_tm_free_slot - get a free slot for task management request
>   * @hba: per adapter instance
> + * @free_slot: pointer to variable with available slot value
>   *
> - * Returns maximum number of task management request slots in case of
> - * task management queue full or returns the free slot number
> + * Get a free tag and lock it until ufshcd_put_tm_slot() is called.
> + * Returns 0 if free slot is not available, else return 1 with tag value
> + * in @free_slot.
>   */
> -static inline int ufshcd_get_tm_free_slot(struct ufs_hba *hba)
> +static bool ufshcd_get_tm_free_slot(struct ufs_hba *hba, int *free_slot)
> +{
> +   int tag;
> +
> +   do {
> +   tag = find_first_zero_bit(
> +   &hba->outstanding_tasks, hba->nutmrs);
> +   if (tag >= hba->nutmrs)
> +   return false;
> +   } while (test_and_set_bit_lock(tag, &hba->outstanding_tasks));
> +
> +   if (free_slot)
> +   *free_slot = tag;
> +
> +   return true;
> +}
> +
> +static inline void ufshcd_put_tm_slot(struct ufs_hba *hba, int slot)
>  {
> -   return find_first_zero_bit(&hba->outstanding_tasks, hba->nutmrs);
> +   clear_bit_unlock(slot, &hba->outstanding_tasks);
>  }
>
>  /**
> @@ -1745,10 +1767,11 @@ static void ufshcd_slave_destroy(struct scsi_device 
> *sdev)
>   * ufshcd_task_req_compl - handle task management request completion
>   * @hba: per adapter instance
>   * @index: index of the completed request
> + * @resp: task management service response
>   *
> - * Returns SUCCESS/FAILED
> + * Returns non-zero value on error, zero on success
>   */
> -static int ufshcd_task_req_compl(struct ufs_hba *hba, u32 index)
> +static int ufshcd_task_req_compl(struct ufs_hba *hba, u32 index, u8 *resp)
>  {
> struct utp_task_req_desc *task_req_descp;
> struct utp_upiu_task_rsp *task_rsp_upiup;
> @@ -1758,9 +1781,6 @@ static int ufshcd_task_req_compl(struct ufs_hba *hba, 
> u32 index)
>
> spin_lock_irqsave(hba->host->host_lock, flags);
>
> -   /* Clear completed tasks from outstanding_tasks */
> -   __clear_bit(index, &hba->outstanding_tasks);
> -
> task_req_descp = hba->utmrdl_base_addr;
> ocs_value = ufshcd_get_tmr_ocs(&task_req_descp[index]);
>
> @@ -1769,19 +1789,15 @@ static int ufshcd_task_req_compl(struct ufs_hba *hba, 
> u32 index)
> task_req_descp[index].task_rsp_upiu;
> task_result = be32_to_cpu(task_rsp_upiup->header.dword_1);
> task_result = ((task_result & MASK_TASK_RESPONSE) >> 8);
> -
> -   if (task_result != UPIU_TASK_MANAGEMENT_FUNC_COMPL &&
> -   task_result != UPIU_TASK_MANAGEMENT_FUNC_SUCCEEDED)
> -   task_result = FAILED;
> -   else
> -   task_result = SUCCESS;
> +   if (resp)
> +   *resp = (u8)task_result;
> } else {
> -   task_result = FAILED

RE: [PATCH 3/5] Drivers: scsi: storvsc: Implement multi-channel support

2013-06-27 Thread KY Srinivasan


> -Original Message-
> From: James Bottomley [mailto:jbottom...@parallels.com]
> Sent: Wednesday, June 26, 2013 11:22 PM
> To: KY Srinivasan
> Cc: gre...@linuxfoundation.org; linux-ker...@vger.kernel.org;
> de...@linuxdriverproject.org; oher...@suse.com; h...@infradead.org; linux-
> s...@vger.kernel.org
> Subject: Re: [PATCH 3/5] Drivers: scsi: storvsc: Implement multi-channel 
> support
> 
> On Tue, 2013-06-04 at 12:05 -0700, K. Y. Srinivasan wrote:
> > Implement multi-channel support for the storage devices.
> 
> This doesn't compile:
> 
>   CC [M]  drivers/scsi/storvsc_drv.o
> drivers/scsi/storvsc_drv.c: In function ‘handle_sc_creation’:
> drivers/scsi/storvsc_drv.c:763:35: error: ‘struct vmbus_channel’ has no
> member named ‘primary_channel’
> drivers/scsi/storvsc_drv.c: In function ‘handle_multichannel_storage’:
> drivers/scsi/storvsc_drv.c:805:2: error: implicit declaration of
> function
> ‘vmbus_set_sc_create_callback’ [-Werror=implicit-function-declaration]
> drivers/scsi/storvsc_drv.c:812:2: error: implicit declaration of
> function
> ‘vmbus_are_subchannels_present’ [-Werror=implicit-function-declaration]
> drivers/scsi/storvsc_drv.c: In function ‘storvsc_on_channel_callback’:
> drivers/scsi/storvsc_drv.c:1223:13: error: ‘struct vmbus_channel’ has no
> member named ‘primary_channel’
> drivers/scsi/storvsc_drv.c:1224:19: error: ‘struct vmbus_channel’ has no
> member named ‘primary_channel’
> drivers/scsi/storvsc_drv.c: In function ‘storvsc_do_io’:
> drivers/scsi/storvsc_drv.c:1341:2: error: implicit declaration of
> function
> ‘vmbus_get_outgoing_channel’ [-Werror=implicit-function-declaration]
> drivers/scsi/storvsc_drv.c:1341:19: warning: assignment makes pointer
> from integer without a cast [enabled by default]
> cc1: some warnings being treated as errors
> make[2]: *** [drivers/scsi/storvsc_drv.o] Error 1
> 
> I assume this is a cross tree dependency?  What's the relevant branch I
> need?

You are right; Greg checked in the relevant patch sometime back:

git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.


Regards,

K. Y

> 
> James

N�r��yb�X��ǧv�^�)޺{.n�+{���"�{ay�ʇڙ�,j��f���h���z��w���
���j:+v���w�j�mzZ+�ݢj"��!�i

Re: [PATCH 7/7] scsi: Add 'eh_deadline' to limit SCSI EH runtime

2013-06-27 Thread Ewan Milne
The eh_deadline changes allow for a significant improvement
in multipath failover time.  It works very well in our testing.
I do have a few corrections, see below:

On Mon, 2013-06-10 at 13:11 +0200, Hannes Reinecke wrote:
> This patchs adds an 'eh_deadline' attribute to the scsi
> host which limits the overall runtime of the SCSI EH.
> When a command is failed the start time of the EH is stored
> in 'last_reset'. If the overall runtime of the SCSI EH is longer
> than last_reset + eh_deadline, the EH is short-circuited and
> falls through to issue a host reset only.
> 
> Signed-off-by: Hannes Reinecke 
> ---
>  drivers/scsi/hosts.c  |   7 +++
>  drivers/scsi/scsi_error.c | 142 
> +++---
>  drivers/scsi/scsi_sysfs.c |  37 
>  include/scsi/scsi_host.h  |   2 +-
>  4 files changed, 180 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index df0c3c7..c8d828f 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -316,6 +316,12 @@ static void scsi_host_dev_release(struct device *dev)
>   kfree(shost);
>  }
>  
> +static unsigned int shost_eh_deadline;
> +
> +module_param_named(eh_deadline, shost_eh_deadline, uint, S_IRUGO|S_IWUSR);
> +MODULE_PARM_DESC(eh_deadline,
> +  "SCSI EH deadline in seconds (should be between 1 and 
> 2^32-1)");
> +
>  static struct device_type scsi_host_type = {
>   .name = "scsi_host",
>   .release =  scsi_host_dev_release,
> @@ -388,6 +394,7 @@ struct Scsi_Host *scsi_host_alloc(struct 
> scsi_host_template *sht, int privsize)
>   shost->unchecked_isa_dma = sht->unchecked_isa_dma;
>   shost->use_clustering = sht->use_clustering;
>   shost->ordered_tag = sht->ordered_tag;
> + shost->eh_deadline = shost_eh_deadline;

This should be shost->eh_deadline = shost_eh_deadline * HZ; since the
parameter is specified in seconds.

>  
>   if (sht->supported_mode == MODE_UNKNOWN)
>   /* means we didn't set it ... default to INITIATOR */
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 467cb3c..cf30475 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -91,6 +91,31 @@ void scsi_schedule_eh(struct Scsi_Host *shost)
>  }
>  EXPORT_SYMBOL_GPL(scsi_schedule_eh);
>  
> +static int sdev_eh_deadline(struct Scsi_Host *shost,
> +unsigned long eh_start)
> +{
> + if (!shost->eh_deadline)
> + return 0;
> +
> + if (shost->last_reset != 0 &&
> + time_before(shost->last_reset, eh_start))
> + eh_start = shost->last_reset;
> +
> + if (time_before(jiffies,
> + eh_start + shost->eh_deadline))
> + return 0;
> +
> + return 1;
> +}
> +
> +static int scsi_host_eh_deadline(struct Scsi_Host *shost)
> +{
> + if (!shost->last_reset)
> + return 0;
> +
> + return sdev_eh_deadline(shost, shost->last_reset);
> +}
> +
>  /**
>   * scsi_eh_abort_handler - Handle command aborts
>   * @work:sdev on which commands should be aborted.
> @@ -102,13 +127,15 @@ scsi_eh_abort_handler(struct work_struct *work)
>   container_of(work, struct scsi_device, abort_work);
>   struct scsi_cmnd *scmd, *tmp;
>   LIST_HEAD(abort_list);
> - unsigned long flags;
> + unsigned long flags, eh_start;
>   int rtn;
>  
>   spin_lock_irqsave(&sdev->list_lock, flags);
>   list_splice_init(&sdev->eh_abort_list, &abort_list);
>   spin_unlock_irqrestore(&sdev->list_lock, flags);
>  
> + eh_start = jiffies;
> +
>   list_for_each_entry_safe(scmd, tmp, &abort_list, eh_entry) {
>   list_del_init(&scmd->eh_entry);
>   if (sdev->sdev_state == SDEV_CANCEL) {
> @@ -119,6 +146,13 @@ scsi_eh_abort_handler(struct work_struct *work)
>   scsi_finish_command(scmd);
>   continue;
>   }
> + if (sdev_eh_deadline(sdev->host, eh_start)) {
> + SCSI_LOG_ERROR_RECOVERY(3,
> + scmd_printk(KERN_INFO, scmd,
> +  "eh timeout, not aborting\n"));
> + list_move_tail(&scmd->eh_entry, &abort_list);
> + goto start_eh;
> + }
>   SCSI_LOG_ERROR_RECOVERY(3,
>   scmd_printk(KERN_INFO, scmd,
>   "aborting command %p\n", scmd));
> @@ -151,6 +185,12 @@ scsi_eh_abort_handler(struct work_struct *work)
>   return;
>  
>  start_eh:
> + spin_lock_irqsave(sdev->host->host_lock, flags);
> + if (sdev->host->eh_deadline &&
> + (!sdev->host->last_reset ||
> +  time_before(eh_start, sdev->host->last_reset)))
> + sdev->host->last_reset = eh_start;
> + spin_unlock_irqrestore(sdev->host->host_lock, flags);
>   list_for_each_entry_safe(scmd, tmp, &abort_list, eh_en

[PATCH v12 0/6] SCSI device removal fixes

2013-06-27 Thread Bart Van Assche

Fix a few issues related to SCSI device removal:
- Fix a race between starved list processing and device removal that
  can trigger a kernel oops.
- Avoid that __scsi_remove_device() is called twice for the same SCSI
  device, which also can cause a kernel oops.
- Restrict the SCSI device state changes allowed via sysfs.
- Avoid that invoking scsi_device_set_state() triggers a race.
- Avoid re-enabling I/O after the transport layer became offline.

Changes compared to v11:
- Left out a patch that was not a device removal bug fix.
- Left out the patches about which there is not yet an agreement.

Changes compared to v10:
- Rebased and retested on top of Linux kernel v3.10-rc5.

Changes compared to v9:
- Changed one WARN_ON() statement into a WARN() statement.

Changes compared to v8:
- Addressed the feedback from Joe Lawrence - dropped the patch that
  makes scsi_remove_host() wait until the last sdev user is gone.
- Eliminated Scsi_Host.tmf_in_progress since it duplicates state
  information available in Scsi_Host.eh_active.
- Added a patch to avoid reenabling I/O after the transport layer
  became offline.

Changes compared to v7:
- Addressed the review comments posted by Hannes Reinecke and Rolf Eike
  Beer.
- Modified patch "Make scsi_remove_host() wait until error handling
  finished" such that it is also safe for SCSI timeout values below
  the maximum LLD response time by modifying scsi_send_eh_cmnd() such
  that it does not invoke any LLD code after scsi_remove_host() started.
- Added a patch to save and restore the host_scribble field.
- Refined / clarified several patch descriptions.
- Rebased and retested on top of kernel v3.8-rc6.

Changes compared to v6:
- Dropped the first six patches since Jens queued these for 3.8.
- Added patch to avoid that __scsi_remove_device() is invoked twice.
- Restore error recovery in the SHOST_CANCEL state.

Changes compared to v5:
- Avoid that block layer work can be scheduled on a dead queue.
- Do not invoke any SCSI LLD callback after scsi_remove_host() finished.
- Stop error handling as soon as scsi_remove_host() started.
- Remove the unused function bsg_goose_queue().
- Avoid that scsi_device_set_state() triggers a race condition.

Changes compared to v4:
- Moved queue_flag_set(QUEUE_FLAG_DEAD, q) from blk_drain_queue() into
  blk_cleanup_queue().
- Declared the new __blk_run_queue_uncond() function inline. Checked in
  the generated assembler code that this function is really inlined in
  __blk_run_queue().
- Elaborated several patch descriptions.
- Added sparse annotations to scsi_request_fn().
- Split several patches.

Changes compared to v3:
- Fixed a race condition by setting QUEUE_FLAG_DEAD earlier.
- Added a patch for fixing a race between starved list processing
  and device removal to this series.

Changes compared to v2:
- Split second patch into two patches.
- Refined patch descriptions.

Changes compared to v1:
- Included a patch to rename QUEUE_FLAG_DEAD.
- Refined the descriptions of the __blk_run_queue_uncond() and
  blk_cleanup_queue() functions.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v12 1/6] Fix race between starved list and device removal

2013-06-27 Thread Bart Van Assche
From: James Bottomley 

scsi_run_queue() examines all SCSI devices that are present on
the starved list. Since scsi_run_queue() unlocks the SCSI host
lock a SCSI device can get removed after it has been removed
from the starved list and before its queue is run. Protect
against that race condition by holding a reference on the
queue while running it.

Signed-off-by: James Bottomley 
Signed-off-by: Bart Van Assche 
Reported-by: Chanho Min 
Reference: http://lkml.org/lkml/2012/8/2/96
Cc: Tejun Heo 
Cc: Mike Christie 
Cc: Hannes Reinecke 
Cc: 
---
 drivers/scsi/scsi_lib.c |   26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..df8bd5a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -434,6 +434,8 @@ static void scsi_run_queue(struct request_queue *q)
list_splice_init(&shost->starved_list, &starved_list);
 
while (!list_empty(&starved_list)) {
+   struct request_queue *slq;
+
/*
 * As long as shost is accepting commands and we have
 * starved queues, call blk_run_queue. scsi_request_fn
@@ -456,11 +458,25 @@ static void scsi_run_queue(struct request_queue *q)
continue;
}
 
-   spin_unlock(shost->host_lock);
-   spin_lock(sdev->request_queue->queue_lock);
-   __blk_run_queue(sdev->request_queue);
-   spin_unlock(sdev->request_queue->queue_lock);
-   spin_lock(shost->host_lock);
+   /*
+* Once we drop the host lock, a racing scsi_remove_device()
+* call may remove the sdev from the starved list and destroy
+* it and the queue.  Mitigate by taking a reference to the
+* queue and never touching the sdev again after we drop the
+* host lock.  Note: if __scsi_remove_device() invokes
+* blk_cleanup_queue() before the queue is run from this
+* function then blk_run_queue() will return immediately since
+* blk_cleanup_queue() marks the queue with QUEUE_FLAG_DYING.
+*/
+   slq = sdev->request_queue;
+   if (!blk_get_queue(slq))
+   continue;
+   spin_unlock_irqrestore(shost->host_lock, flags);
+
+   blk_run_queue(slq);
+   blk_put_queue(slq);
+
+   spin_lock_irqsave(shost->host_lock, flags);
}
/* put any unprocessed entries back */
list_splice(&starved_list, &shost->starved_list);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v12 2/6] Avoid calling __scsi_remove_device() twice

2013-06-27 Thread Bart Van Assche
If something goes wrong during LUN scanning, e.g. a transport layer
failure occurs, then __scsi_remove_device() can get invoked by the
LUN scanning code for a SCSI device in state SDEV_CREATED_BLOCK. If
this happens then the SCSI device has not yet been added to sysfs
(is_visible == 0).  Make sure that in that case the transition into
state SDEV_DEL occurs. This avoids that __scsi_remove_device() gets
invoked a second time by scsi_forget_host().

Signed-off-by: Bart Van Assche 
Cc: James Bottomley 
Cc: Mike Christie 
Cc: Hannes Reinecke 
Cc: Tejun Heo 
---
 drivers/scsi/scsi_lib.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index df8bd5a..124392f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2193,6 +2193,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum 
scsi_device_state state)
case SDEV_OFFLINE:
case SDEV_TRANSPORT_OFFLINE:
case SDEV_CANCEL:
+   case SDEV_CREATED_BLOCK:
break;
default:
goto illegal;
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v12 3/6] Restrict device state changes allowed via sysfs

2013-06-27 Thread Bart Van Assche
Restrict the SCSI device state changes allowd via sysfs to the
OFFLINE<>RUNNING transitions. Other transitions may confuse
the SCSI mid-layer. As an example, changing the state of a SCSI
device via sysfs into "cancel" or "deleted" prevents removal of
a SCSI device by scsi_remove_host().

Signed-off-by: Bart Van Assche 
Cc: Tejun Heo 
Cc: James Bottomley 
Cc: Mike Christie 
Cc: Hannes Reinecke 
Cc: David Milburn 
---
 drivers/scsi/scsi_sysfs.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 931a7d9..013c6de 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -605,7 +605,7 @@ store_state_field(struct device *dev, struct 
device_attribute *attr,
break;
}
}
-   if (!state)
+   if (state != SDEV_OFFLINE && state != SDEV_RUNNING)
return -EINVAL;
 
if (scsi_device_set_state(sdev, state))
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v12 4/6] Avoid saving/restoring interrupt state inside scsi_remove_host()

2013-06-27 Thread Bart Van Assche
Since it is not allowed to invoke scsi_remove_host() with interrupts
disabled, avoid saving and restoring the interrupt state inside
scsi_remove_host(). This patch does not change the functionality of
the function scsi_remove_host().

Signed-off-by: Bart Van Assche 
Acked-by: Tejun Heo 
Acked-by: Hannes Reinecke 
Reviewed-by: Mike Christie 
Cc: James Bottomley 
---
 drivers/scsi/hosts.c |   12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
index df0c3c7..034a567 100644
--- a/drivers/scsi/hosts.c
+++ b/drivers/scsi/hosts.c
@@ -156,27 +156,25 @@ EXPORT_SYMBOL(scsi_host_set_state);
  **/
 void scsi_remove_host(struct Scsi_Host *shost)
 {
-   unsigned long flags;
-
mutex_lock(&shost->scan_mutex);
-   spin_lock_irqsave(shost->host_lock, flags);
+   spin_lock_irq(shost->host_lock);
if (scsi_host_set_state(shost, SHOST_CANCEL))
if (scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY)) {
-   spin_unlock_irqrestore(shost->host_lock, flags);
+   spin_unlock_irq(shost->host_lock);
mutex_unlock(&shost->scan_mutex);
return;
}
-   spin_unlock_irqrestore(shost->host_lock, flags);
+   spin_unlock_irq(shost->host_lock);
 
scsi_autopm_get_host(shost);
scsi_forget_host(shost);
mutex_unlock(&shost->scan_mutex);
scsi_proc_host_rm(shost);
 
-   spin_lock_irqsave(shost->host_lock, flags);
+   spin_lock_irq(shost->host_lock);
if (scsi_host_set_state(shost, SHOST_DEL))
BUG_ON(scsi_host_set_state(shost, SHOST_DEL_RECOVERY));
-   spin_unlock_irqrestore(shost->host_lock, flags);
+   spin_unlock_irq(shost->host_lock);
 
transport_unregister_device(&shost->shost_gendev);
device_unregister(&shost->shost_dev);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v12 5/6] Avoid that scsi_device_set_state() triggers a race

2013-06-27 Thread Bart Van Assche
Make concurrent invocations of scsi_device_set_state() safe.

Signed-off-by: Bart Van Assche 
Acked-by: Hannes Reinecke 
Cc: James Bottomley 
Cc: Tejun Heo 
Cc: Mike Christie 
---
 drivers/scsi/scsi_error.c |4 
 drivers/scsi/scsi_lib.c   |   43 ++-
 drivers/scsi/scsi_scan.c  |   15 ---
 drivers/scsi/scsi_sysfs.c |   24 +++-
 4 files changed, 65 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f43de1e..7006359 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1380,7 +1380,11 @@ static void scsi_eh_offline_sdevs(struct list_head 
*work_q,
list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
sdev_printk(KERN_INFO, scmd->device, "Device offlined - "
"not ready after error recovery\n");
+
+   spin_lock_irq(scmd->device->host->host_lock);
scsi_device_set_state(scmd->device, SDEV_OFFLINE);
+   spin_unlock_irq(scmd->device->host->host_lock);
+
if (scmd->eh_eflags & SCSI_EH_CANCEL_CMD) {
/*
 * FIXME: Handle lost cmds.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 124392f..6a4fde7 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2096,7 +2096,9 @@ EXPORT_SYMBOL(scsi_test_unit_ready);
  * @state: state to change to.
  *
  * Returns zero if unsuccessful or an error if the requested 
- * transition is illegal.
+ * transition is illegal. It is the responsibility of the caller to make
+ *  sure that a call of this function does not race with other code that
+ *  accesses the device state, e.g. by holding the host lock.
  */
 int
 scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
@@ -2374,7 +2376,13 @@ EXPORT_SYMBOL_GPL(sdev_evt_send_simple);
 int
 scsi_device_quiesce(struct scsi_device *sdev)
 {
-   int err = scsi_device_set_state(sdev, SDEV_QUIESCE);
+   struct Scsi_Host *host = sdev->host;
+   int err;
+
+   spin_lock_irq(host->host_lock);
+   err = scsi_device_set_state(sdev, SDEV_QUIESCE);
+   spin_unlock_irq(host->host_lock);
+
if (err)
return err;
 
@@ -2398,13 +2406,21 @@ EXPORT_SYMBOL(scsi_device_quiesce);
  */
 void scsi_device_resume(struct scsi_device *sdev)
 {
+   struct Scsi_Host *host = sdev->host;
+   int err;
+
/* check if the device state was mutated prior to resume, and if
 * so assume the state is being managed elsewhere (for example
 * device deleted during suspend)
 */
-   if (sdev->sdev_state != SDEV_QUIESCE ||
-   scsi_device_set_state(sdev, SDEV_RUNNING))
+   spin_lock_irq(host->host_lock);
+   err = sdev->sdev_state == SDEV_QUIESCE ?
+   scsi_device_set_state(sdev, SDEV_RUNNING) : -EINVAL;
+   spin_unlock_irq(host->host_lock);
+
+   if (err)
return;
+
scsi_run_queue(sdev->request_queue);
 }
 EXPORT_SYMBOL(scsi_device_resume);
@@ -2454,17 +2470,19 @@ EXPORT_SYMBOL(scsi_target_resume);
 int
 scsi_internal_device_block(struct scsi_device *sdev)
 {
+   struct Scsi_Host *host = sdev->host;
struct request_queue *q = sdev->request_queue;
unsigned long flags;
int err = 0;
 
+   spin_lock_irqsave(host->host_lock, flags);
err = scsi_device_set_state(sdev, SDEV_BLOCK);
-   if (err) {
+   if (err)
err = scsi_device_set_state(sdev, SDEV_CREATED_BLOCK);
+   spin_unlock_irqrestore(host->host_lock, flags);
 
-   if (err)
-   return err;
-   }
+   if (err)
+   return err;
 
/* 
 * The device has transitioned to SDEV_BLOCK.  Stop the
@@ -2499,13 +2517,16 @@ int
 scsi_internal_device_unblock(struct scsi_device *sdev,
 enum scsi_device_state new_state)
 {
+   struct Scsi_Host *host = sdev->host;
struct request_queue *q = sdev->request_queue; 
unsigned long flags;
+   int ret = 0;
 
/*
 * Try to transition the scsi device to SDEV_RUNNING or one of the
 * offlined states and goose the device queue if successful.
 */
+   spin_lock_irqsave(host->host_lock, flags);
if ((sdev->sdev_state == SDEV_BLOCK) ||
(sdev->sdev_state == SDEV_TRANSPORT_OFFLINE))
sdev->sdev_state = new_state;
@@ -2517,7 +2538,11 @@ scsi_internal_device_unblock(struct scsi_device *sdev,
sdev->sdev_state = SDEV_CREATED;
} else if (sdev->sdev_state != SDEV_CANCEL &&
 sdev->sdev_state != SDEV_OFFLINE)
-   return -EINVAL;
+   ret = -EINVAL;
+   spin_unlock_irqrestore(host->host_lock, flags);
+
+   if (ret)
+   return ret;
 
spin_lock_irqsave(q->

[PATCH v12 6/6] Avoid re-enabling I/O after the transport became offline

2013-06-27 Thread Bart Van Assche
Disallow the SDEV_TRANSPORT_OFFLINE to SDEV_CANCEL transition such
that no I/O is sent to devices for which the transport is offline.
Notes:
- Functions like sd_shutdown() use scsi_execute_req() and hence
  set the REQ_PREEMPT flag. Such requests are passed to the LLD
  queuecommand callback in the SDEV_CANCEL state.
- This patch does not affect Fibre Channel LLD drivers since these
  drivers invoke fc_remote_port_chkready() before submitting a SCSI
  request to the HBA. That prevents a timeout to occur in state
  SDEV_CANCEL if the transport is offline.

Signed-off-by: Bart Van Assche 
Cc: Mike Christie 
Cc: James Bottomley 
Cc: Hannes Reinecke 
Cc: Tejun Heo 
---
 drivers/scsi/scsi_lib.c   |1 -
 drivers/scsi/scsi_sysfs.c |4 +++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6a4fde7..63875c3 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2180,7 +2180,6 @@ scsi_device_set_state(struct scsi_device *sdev, enum 
scsi_device_state state)
case SDEV_RUNNING:
case SDEV_QUIESCE:
case SDEV_OFFLINE:
-   case SDEV_TRANSPORT_OFFLINE:
case SDEV_BLOCK:
break;
default:
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index dfbaa34..666b741 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -959,14 +959,16 @@ void __scsi_remove_device(struct scsi_device *sdev)
 {
struct Scsi_Host *shost = sdev->host;
struct device *dev = &sdev->sdev_gendev;
+   enum scsi_device_state sdev_state;
int res;
 
if (sdev->is_visible) {
spin_lock_irq(shost->host_lock);
+   sdev_state = sdev->sdev_state;
res = scsi_device_set_state(sdev, SDEV_CANCEL);
spin_unlock_irq(shost->host_lock);
 
-   if (res != 0)
+   if (res != 0 && sdev_state != SDEV_TRANSPORT_OFFLINE)
return;
 
bsg_unregister_queue(sdev->request_queue);
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 4/6] [SCSI] Generate uevents for certain Unit Attention codes

2013-06-27 Thread Ewan Milne
On Mon, 2013-06-24 at 14:58 +, James Bottomley wrote:
> On Mon, 2013-06-24 at 10:11 -0400, Ewan Milne wrote:
> > On Wed, 2013-06-19 at 18:48 +, James Bottomley wrote:
> > > On Wed, 2013-06-19 at 13:42 -0400, Ewan D. Milne wrote:
> > > > From: "Ewan D. Milne" 
> > > > 
> > > > Generate a uevent on the scsi_target object when the following
> > > > Unit Attention ASC/ASCQ code is received:
> > > > 
> > > > 3F/0E  REPORTED LUNS DATA HAS CHANGED
> > > > 
> > > > Generate a uevent on the scsi_device object when the following
> > > > Unit Attention ASC/ASCQ codes are received:
> > > > 
> > > > 2A/01  MODE PARAMETERS CHANGED
> > > > 2A/09  CAPACITY DATA HAS CHANGED
> > > > 38/07  THIN PROVISIONING SOFT THRESHOLD REACHED
> > > > 
> > > > All uevent generation is aggregated and rate-limited so that any
> > > > individual event is delivered no more than once every 2 seconds.
> > > 
> > > Why?  What causes you to think these events would be repeated on a
> > > massive scale.  Mode and Capacity changes are signalled only once per
> > > actual change, which doesn't occur very often.  SBC-3 says that the TP
> > > thresholds are only signalled once but may be signalled again after a
> > > reset.  In general, T10 treats UA as exceptional conditions ... there's
> > > no reason to think they keep repeating.
> > 
> > Well, the concern I had is that since a UA can theoretically be reported
> > on every command, a malfunctioning device could quickly overload udevd.
> 
> We had devices in the 1990s that did this ... I haven't seen any for
> years.  I take it the qualifier "theoretical" means you haven't actually
> seen this behaviour from a current device in the field?

No, you're right, I haven't.  I was just trying to be careful.
If you think it's OK for Mode and Capacity changes to be reported
each time we get a UA, that's fine.

> 
> > I have seen cases where udevd gets significantly behind when processing
> > a flood of events, and didn't want to make that worse.  Kay had concerns
> > about that when Hannes was working on this a while back, I believe.
> > I also didn't want other events to get lost if UA events filled up the
> > NL queue to udevd in userspace.
> 
> The events you're reporting are infrequent in normal operation.  If the
> device goes rogue and floods them, udev issues are likely to be the
> least of our concerns.
> 
> The fact that we may generate a flood because we have a massive number
> of LUNs which each report the infrequent event is a concern, but it
> should be fixed without rate limiting, see below:
> 
> > The other thing that aggregation helps with is when every LUN on a
> > target says REPORTED LUNS DATA HAS CHANGED.  Some storage arrays allow
> > hundreds of LUNS on a target, and I think they will all report the UA
> > if the LUN provisioning to the host is changed.  There is a mode that
> > can be used to suppress this, and only report one UA, but I don't know
> > if all storage arrays support it.  Now, granted, the UAs will only be
> > reported by each LUN when they receive a command, so this could happen
> > at any time in the future, but unfortunately that is the way SCSI works.
> 
> So fixing this problem is what's needed rather than a generic rate limit
> mechanism.  We already have a rudimentary mechanism for suppressing the
> flood of UAs we get on target reset ... reuse the same thing to make
> sure we only get one REPORTED LUNS DATA HAS CHANGED per target.

I looked at doing this, but unfortunately it appears as if it is hard
to know which LUNs are expected to report the REPORTED LUNS DATA HAS
CHANGED if some other LUN has done so.  The difficulty is in the SPC-3
behavior that potentially clears this UA condition on all LUNs that are
accessible on an I_T nexus when a REPORT LUNS command is received.

So, some, but perhaps not all, of the LUNs would report the UA.

Since there isn't a good way to know when a REPORT LUNS command enters
the enabled task state on the device server (as opposed to when it is
sent by the host) relative to when other commands on the various LUNs
are either processed or terminated with the UA, there is the potential
of suppressing a subsequent UA if the LUN inventory changes again.
(The UA should be pending on *some* other LUN(s) in this case, and not
be masked by this logic, but there is no guarantee that those LUNs will
be accessed.)

The suppression of UAs (e.g. 29/00) received following a TARGET RESET
doesn't have this problem, because those UA are only cleared when a
command that is not an INQUIRY, REPORT LUNS, or REQUEST SENSE is
received by each LUN.

I suppose this could be solved by stopping other I/O to the target when
a REPORT LUNS is being issued, but that seems like an invasive change.

Devices conforming to SAM-4 don't have this problem, because they only
return REPORTED LUNS DATA HAS CHANGED on one LUN, but it would be more
useful to make this work with SCSI-3 devices.

> 
> Note there are some fixes required to the current 

Re: fcoe pull request for 3.9-rc

2013-06-27 Thread Love, Robert W
On Wed 26 Jun 2013 11:14:02 PM PDT, James Bottomley wrote:
> On Tue, 2013-06-25 at 20:55 +, Love, Robert W wrote:
>> The following changes since commit 1e876e3b1a9df25bb04682b0d48aaa7e8ae1fc82:
>>
>> Merge branch 'for-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux (2013-06-25
>> 09:08:07 -1000)
>>
>> are available in the git repository at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/rwlove/fcoe.git
>> tags/critical_fix_for_3.9
>
> This is simple and I can review from the attached patch, so I'll take it
> this time. But, please, next time follow proper process.  That means the
> individual patches should go over the SCSI mailing list for review
> *before* they're sent in a pull request.
>

The reason I sent this pull request directly to Linus was that the 
patches I mailed at (2013-05-21) RC2 had not yet seen any attention 
since they were posted to linux-scsi. It is now RC7 and with 5 weeks of 
inactivity on my initial 3.10 patch series I had no faith that this 
critical fix would make it through the SCSI tree before the 3.10 kernel 
was released. FYI, Linus has taken this change already.

I would have preferred to send the patch to linux-scsi for review 
first, but it had been thoroughly reviewed on the FCoE list, it's a 
fairly trivial change and I was a hurry to get it into Linus' inbox 
given his statement about "finding new ways to insult your pets." The 
patch is also part of the pull request, so it was review-able by anyone 
on linux-scsi or linux-kernel.

Thanks, //Rob

PS: I totally botched the title of this pull request, obviously it 
should say "fcoe pull request for 3.10-rc".



Re: RFC: Allow block drivers to poll for I/O instead of sleeping

2013-06-27 Thread Rik van Riel

On 06/23/2013 02:29 PM, Linus Torvalds wrote:


You could try to do that either *in* the idle thread (which would take
the context switch overhead - maybe negating some of the advantages),
or alternatively hook into the scheduler idle logic before actually
doing the switch.

But anything that starts polling when there are other runnable
processes to be done sounds really debatable. Even if it's "only" 5us
or so. There's a lot of real work that could be done in 5us.


Having a hook into the idle code could be useful for
KVM, too.

In certain message passing workloads, we have seen
that the system throughput suffers greatly simply
by having the KVM thread's preempt notifiers save
CPU state when the guest goes idle, and re-load it
when the guest VCPU is activated again.

Avoiding the context switch overhead when nothing
else wants to run, but being immediately preempted
when something does, could be a big win.

Would it make sense to have some hook that says
"I *am* an idle thread right now", as opposed to
"context switch to the idle thread"?

That hook could even run the cpuidle code, and
switch to the real idle task (saving the current
task state), and put the CPU into a C-state, when
the expected sleep time is long...



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: Allow block drivers to poll for I/O instead of sleeping

2013-06-27 Thread Rik van Riel

On 06/20/2013 04:17 PM, Matthew Wilcox wrote:


--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4527,6 +4527,36 @@ long __sched io_schedule_timeout(long timeout)
return ret;
  }

+/*
+ * Wait for an I/O to complete against this backing_dev_info.  If the
+ * task exhausts its timeslice polling for completions, call io_schedule()
+ * anyway.  If a signal comes pending, return so the task can handle it.
+ * If the io_poll returns an error, give up and call io_schedule(), but
+ * swallow the error.  We may miss an I/O completion (eg if the interrupt
+ * handler gets to it first).  Guard against this possibility by returning
+ * if we've been set back to TASK_RUNNING.
+ */
+void io_wait(struct backing_dev_info *bdi)
+{


I would like something a little more generic in the
scheduler code, that could also be used by other
things in the kernel (say, KVM with message passing
workloads).

Maybe something looking a little like this?

void idle_poll(struct idle_poll_info *ipi)

struct idle_poll_info {
int (*idle_poll_func)(void *data);
int (*idle_poll_preempt)(void *data);
void *data;
}

That way the kernel can:

1) mark the current thread as having idle priority,
   allowing the scheduler to preempt it if something
   else wants to run

2) switch to asynchronous mode if something else wants
   to run, or if the average wait for the process is
   so long that it is better to go asynchronous and
   avoid polling

3) poll for completion if nothing else wants to run

Does that make sense?

Did I forget something you need?

Did I forget something KVM could need?

Is this insane? If so, is it too insane? :)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 25/45] [SCSI] fcoe: Use get/put_online_cpus_atomic() to prevent CPU offline

2013-06-27 Thread Srivatsa S. Bhat
Once stop_machine() is gone from the CPU offline path, we won't be able
to depend on disabling preemption to prevent CPUs from going offline
from under us.

Use the get/put_online_cpus_atomic() APIs to prevent CPUs from going
offline, while invoking from atomic context.

Cc: Robert Love 
Cc: "James E.J. Bottomley" 
Cc: de...@open-fcoe.org
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat 
---

 drivers/scsi/fcoe/fcoe.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index 32ae6c6..eaa390e 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1484,6 +1484,7 @@ static int fcoe_rcv(struct sk_buff *skb, struct 
net_device *netdev,
 * was originated, otherwise select cpu using rx exchange id
 * or fcoe_select_cpu().
 */
+   get_online_cpus_atomic();
if (ntoh24(fh->fh_f_ctl) & FC_FC_EX_CTX)
cpu = ntohs(fh->fh_ox_id) & fc_cpu_mask;
else {
@@ -1493,8 +1494,10 @@ static int fcoe_rcv(struct sk_buff *skb, struct 
net_device *netdev,
cpu = ntohs(fh->fh_rx_id) & fc_cpu_mask;
}
 
-   if (cpu >= nr_cpu_ids)
+   if (cpu >= nr_cpu_ids) {
+   put_online_cpus_atomic();
goto err;
+   }
 
fps = &per_cpu(fcoe_percpu, cpu);
spin_lock(&fps->fcoe_rx_list.lock);
@@ -1514,6 +1517,7 @@ static int fcoe_rcv(struct sk_buff *skb, struct 
net_device *netdev,
spin_lock(&fps->fcoe_rx_list.lock);
if (!fps->thread) {
spin_unlock(&fps->fcoe_rx_list.lock);
+   put_online_cpus_atomic();
goto err;
}
}
@@ -1535,6 +1539,7 @@ static int fcoe_rcv(struct sk_buff *skb, struct 
net_device *netdev,
if (fps->thread->state == TASK_INTERRUPTIBLE)
wake_up_process(fps->thread);
spin_unlock(&fps->fcoe_rx_list.lock);
+   put_online_cpus_atomic();
 
return 0;
 err:

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


re: [SCSI] cxgb4i: add support for T5 adapter

2013-06-27 Thread Dan Carpenter
Hello Karen Xie,

The patch 3bd3e8bf6250: "[SCSI] cxgb4i: add support for T5 adapter"
from May 29, 2013, leads to the following Sparse warning:

drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:284:29: warning: incorrect type in 
assignment (different base types)
drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:284:29:expected restricted __be64 
[usertype] params
drivers/scsi/cxgbi/cxgb4i/cxgb4i.c:284:29:got restricted __be32 [usertype] 


drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
   271  struct cpl_t5_act_open_req *req =
   272  (struct cpl_t5_act_open_req *)skb->head;
   273  
   274  req = (struct cpl_t5_act_open_req *)skb->head;
   275  
   276  INIT_TP_WR(req, 0);
   277  OPCODE_TID(req) = 
cpu_to_be32(MK_OPCODE_TID(CPL_ACT_OPEN_REQ,
   278  qid_atid));
   279  req->local_port = csk->saddr.sin_port;
   280  req->peer_port = csk->daddr.sin_port;
   281  req->local_ip = csk->saddr.sin_addr.s_addr;
   282  req->peer_ip = csk->daddr.sin_addr.s_addr;
   283  req->opt0 = cpu_to_be64(opt0);
   284  req->params = cpu_to_be32(select_ntuple(csk->cdev, 
csk->l2t));
^^^
For cpl_t5_act_open_req ->params is a __be64.  For other types it is a
__be32 so I suspect this is a copy and paste error.

   285  opt2 |= 1 << 31;
   286  req->opt2 = cpu_to_be32(opt2);

Here is an article on how to use Sparse for endian bugs:
http://lwn.net/Articles/205624/

Also I'm on the topic, I had another concern later in the file:

drivers/scsi/cxgbi/cxgb4i/cxgb4i.c
  1629  rpl = (struct cpl_act_establish *)skb->data;
  1630  opc = rpl->ot.opcode;
  1631  log_debug(1 << CXGBI_DBG_TOE,
  1632  "cdev %p, opcode 0x%x(0x%x,0x%x), skb %p.\n",
  1633   cdev, opc, rpl->ot.opcode_tid, 
ntohl(rpl->ot.opcode_tid), skb);
  1634  if (cxgb4i_cplhandlers[opc])
  1635  cxgb4i_cplhandlers[opc](cdev, skb);
   ^^^
Since "opc" comes from skb->data shouldn't we check that it is less than
ARRAY_SIZE(cxgb4i_cplhandlers) (239 elements)?  Another solution that's
used in drivers/net/ethernet/chelsio/cxgb/cpl5_cmd.h would be to define
the array as having 256 elements.

  1636  else {
  1637  pr_err("No handler for opcode 0x%x.\n", opc);
  1638  __kfree_skb(skb);
  1639  }

regards,
dan carpenter

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 1/1] scsi/megaraid: minor cut and paste error fixed.

2013-06-27 Thread Saxena, Sumit
Thanks James for pointing this out, there is no harm with this change.

Acked-by: Sumit Saxena 

Sumit

>-Original Message-
>From: James Georgas [mailto:soulpa...@gmail.com]
>Sent: Wednesday, June 26, 2013 11:33 PM
>To: DL-MegaRAID Linux; jbottom...@parallels.com
>Cc: linux-ker...@vger.kernel.org; linux-scsi@vger.kernel.org; James
>Georgas
>Subject: [PATCH 1/1] scsi/megaraid: minor cut and paste error fixed.
>
>This looks like a cut and paste typo to me. Both of the
>megasas_read_fw_status_reg_* functions involved are identical though, so
>there was no bad behaviour. I changed it for consistency and clarity.
>
>Signed-off-by: James Georgas 
>---
> drivers/scsi/megaraid/megaraid_sas_base.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c
>b/drivers/scsi/megaraid/megaraid_sas_base.c
>index 3a9ddae..87591cd 100644
>--- a/drivers/scsi/megaraid/megaraid_sas_base.c
>+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
>@@ -583,7 +583,7 @@ megasas_clear_intr_skinny(struct
>megasas_register_set __iomem *regs)
>   /*
>* Check if it is our interrupt
>*/
>-  if ((megasas_read_fw_status_reg_gen2(regs) & MFI_STATE_MASK) ==
>+  if ((megasas_read_fw_status_reg_skinny(regs) & MFI_STATE_MASK) ==
>   MFI_STATE_FAULT) {
>   mfiStatus = MFI_INTR_FLAG_FIRMWARE_STATE_CHANGE;
>   } else
>--
>1.8.1.5
>


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html