Re: [PATCH v3 4/6] [SCSI] Generate uevents for certain Unit Attention codes
Hi Ewan, some comments inline: On 06/19/2013 07:42 PM, Ewan D. Milne wrote: > From: "Ewan D. Milne" > > Generate a uevent on the scsi_target object when the following > Unit Attention ASC/ASCQ code is received: > > 3F/0E REPORTED LUNS DATA HAS CHANGED > > Generate a uevent on the scsi_device object when the following > Unit Attention ASC/ASCQ codes are received: > > 2A/01 MODE PARAMETERS CHANGED > 2A/09 CAPACITY DATA HAS CHANGED > 38/07 THIN PROVISIONING SOFT THRESHOLD REACHED > > All uevent generation is aggregated and rate-limited so that any > individual event is delivered no more than once every 2 seconds. > > Log kernel messages when the following Unit Attention ASC/ASCQ > codes are received: > > 2A/xx PARAMETERS CHANGED > 3F/03 INQUIRY DATA HAS CHANGED > 3F/xx TARGET OPERATING CONDITIONS HAVE CHANGED > > Also changed the kernel log messages on existing Unit Attention > codes, to remove text that indicates that the conditions were not > handled in any way (since they are now handled in some way) and > reflect the wording used in SPC-4. > > Also log a kernel message when the a scsi device reports a Unit > Attention queue overflow, indicating that status has been lost. > > These changes are only enabled when the kernel config option > CONFIG_SCSI_ENHANCED_UA is set. Otherwise, the behavior is the > same as before, including the existing kernel log messages. > However, the detection of these conditions was moved to be earlier > in scsi_check_sense(), because the code was not always reached. > > Added a new exported function scsi_report_sense() to allow drivers > to report sense data that is not associated with a SCSI command. > > Signed-off-by: Ewan D. Milne > --- > drivers/scsi/scsi_error.c | 187 > - > drivers/scsi/scsi_lib.c| 17 - > drivers/scsi/scsi_priv.h | 2 + > drivers/scsi/scsi_scan.c | 5 ++ > drivers/scsi/scsi_sysfs.c | 3 + > include/scsi/scsi_cmnd.h | 4 + > include/scsi/scsi_device.h | 22 ++ > include/scsi/scsi_eh.h | 5 ++ > 8 files changed, 223 insertions(+), 22 deletions(-) > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index d0f71e5..d0b5a26 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -223,6 +223,86 @@ static inline void scsi_eh_prt_fail_stats(struct > Scsi_Host *shost, > #endif > > /** > + * scsi_report_sense - Examine scsi sense information and log messages for > + * certain conditions, also issue uevents if so configured. > + * @sshdr: sshdr to be examined > + */ > +void scsi_report_sense(struct scsi_device *sdev, struct scsi_sense_hdr > *sshdr) > +{ > +#ifdef CONFIG_SCSI_ENHANCED_UA > + if (sshdr->ua_queue_overflow) > + sdev_printk(KERN_WARNING, sdev, > + "Unit Attention queue overflow"); > +#endif > + > + if (sshdr->sense_key == UNIT_ATTENTION) { > + if (sshdr->asc == 0x3f && sshdr->ascq == 0x03) > + sdev_printk(KERN_WARNING, sdev, > + "Inquiry data has changed"); Why no event for this one? I would think that this event warrants a device rescan, so definitely would want to be informed here ... > + else if (sshdr->asc == 0x3f && sshdr->ascq == 0x0e) { > +#ifdef CONFIG_SCSI_ENHANCED_UA > + struct scsi_target *starget = scsi_target(sdev); > + if (atomic_xchg(&starget->lun_change_reported, 1) == 0) > + schedule_delayed_work(&starget->ua_dwork, 2*HZ); > + sdev_printk(KERN_WARNING, sdev, > + "Reported LUNs data has changed"); > +#else > + sdev_printk(KERN_WARNING, sdev, > + "Warning! Received an indication that the " > + "LUN assignments on this target have " > + "changed. The Linux SCSI layer does not " > + "automatically remap LUN assignments.\n"); > +#endif As James B. said, I would not modify the existing comment. Just because we did send an event that doesn't mean that someone actually too some action for that event ... So I'd prefer to have the original messages in place. > + } else if (sshdr->asc == 0x3f) > +#ifdef CONFIG_SCSI_ENHANCED_UA > + sdev_printk(KERN_WARNING, sdev, > + "Target operating conditions have changed"); > +#else > + sdev_printk(KERN_WARNING, sdev, > + "Warning! Received an indication that the " > + "operating parameters on this target have " > + "changed. The Linux SCSI layer does not " > + "automatically adjust these parameters.\n"); > +#endif > + > +
Re: [PATCH] LIBISCSI: Added new boot entries in the session sysfs
On 06/10/2013 01:34 PM, Eddie Wai wrote: > index fe7f06c..f9cc89b 100644 > --- a/include/scsi/iscsi_if.h > +++ b/include/scsi/iscsi_if.h > @@ -487,6 +487,10 @@ enum iscsi_param { > ISCSI_PARAM_TGT_RESET_TMO, > ISCSI_PARAM_TARGET_ALIAS, > > + ISCSI_PARAM_BOOT_ROOT, > + ISCSI_PARAM_BOOT_NIC, > + ISCSI_PARAM_BOOT_TARGET, > + > ISCSI_PARAM_CHAP_IN_IDX, > ISCSI_PARAM_CHAP_OUT_IDX, > /* must always be last */ We cannot add in the middle because it would screw up older tools and kernels. Just add adter the CHAP ones at the end. Other than that patch looks ok. -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] LIBISCSI: Added new boot entries in the session sysfs
v2: Moved the new BOOT params to the end of the iscsi_param enum This is the kernel part of the modification to extract the net params from the ibft sysfs to the iface struct used for the connection request upon sync_session in the open-iscsi util. Three new session sysfs params are defined: boot_root - holds the name of the /sys/firmware/ibft or iscsi_rootN boot_nic - holds the ethernetN name boot_target - holds the targetN name Signed-off-by: Eddie Wai --- drivers/scsi/libiscsi.c | 18 ++ drivers/scsi/scsi_transport_iscsi.c | 12 include/scsi/iscsi_if.h |5 + include/scsi/libiscsi.h |4 4 files changed, 39 insertions(+), 0 deletions(-) diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 5de9469..ae69dfc 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -2808,6 +2808,9 @@ void iscsi_session_teardown(struct iscsi_cls_session *cls_session) kfree(session->targetname); kfree(session->targetalias); kfree(session->initiatorname); + kfree(session->boot_root); + kfree(session->boot_nic); + kfree(session->boot_target); kfree(session->ifacename); iscsi_destroy_session(cls_session); @@ -3248,6 +3251,12 @@ int iscsi_set_param(struct iscsi_cls_conn *cls_conn, return iscsi_switch_str_param(&session->ifacename, buf); case ISCSI_PARAM_INITIATOR_NAME: return iscsi_switch_str_param(&session->initiatorname, buf); + case ISCSI_PARAM_BOOT_ROOT: + return iscsi_switch_str_param(&session->boot_root, buf); + case ISCSI_PARAM_BOOT_NIC: + return iscsi_switch_str_param(&session->boot_nic, buf); + case ISCSI_PARAM_BOOT_TARGET: + return iscsi_switch_str_param(&session->boot_target, buf); default: return -ENOSYS; } @@ -3326,6 +3335,15 @@ int iscsi_session_get_param(struct iscsi_cls_session *cls_session, case ISCSI_PARAM_INITIATOR_NAME: len = sprintf(buf, "%s\n", session->initiatorname); break; + case ISCSI_PARAM_BOOT_ROOT: + len = sprintf(buf, "%s\n", session->boot_root); + break; + case ISCSI_PARAM_BOOT_NIC: + len = sprintf(buf, "%s\n", session->boot_nic); + break; + case ISCSI_PARAM_BOOT_TARGET: + len = sprintf(buf, "%s\n", session->boot_target); + break; default: return -ENOSYS; } diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 133926b..abf7c40 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -3473,6 +3473,9 @@ iscsi_session_attr(tgt_reset_tmo, ISCSI_PARAM_TGT_RESET_TMO, 0); iscsi_session_attr(ifacename, ISCSI_PARAM_IFACE_NAME, 0); iscsi_session_attr(initiatorname, ISCSI_PARAM_INITIATOR_NAME, 0); iscsi_session_attr(targetalias, ISCSI_PARAM_TARGET_ALIAS, 0); +iscsi_session_attr(boot_root, ISCSI_PARAM_BOOT_ROOT, 0); +iscsi_session_attr(boot_nic, ISCSI_PARAM_BOOT_NIC, 0); +iscsi_session_attr(boot_target, ISCSI_PARAM_BOOT_TARGET, 0); static ssize_t show_priv_session_state(struct device *dev, struct device_attribute *attr, @@ -3568,6 +3571,9 @@ static struct attribute *iscsi_session_attrs[] = { &dev_attr_sess_ifacename.attr, &dev_attr_sess_initiatorname.attr, &dev_attr_sess_targetalias.attr, + &dev_attr_sess_boot_root.attr, + &dev_attr_sess_boot_nic.attr, + &dev_attr_sess_boot_target.attr, &dev_attr_priv_sess_recovery_tmo.attr, &dev_attr_priv_sess_state.attr, &dev_attr_priv_sess_creator.attr, @@ -3631,6 +3637,12 @@ static umode_t iscsi_session_attr_is_visible(struct kobject *kobj, param = ISCSI_PARAM_INITIATOR_NAME; else if (attr == &dev_attr_sess_targetalias.attr) param = ISCSI_PARAM_TARGET_ALIAS; + else if (attr == &dev_attr_sess_boot_root.attr) + param = ISCSI_PARAM_BOOT_ROOT; + else if (attr == &dev_attr_sess_boot_nic.attr) + param = ISCSI_PARAM_BOOT_NIC; + else if (attr == &dev_attr_sess_boot_target.attr) + param = ISCSI_PARAM_BOOT_TARGET; else if (attr == &dev_attr_priv_sess_recovery_tmo.attr) return S_IRUGO | S_IWUSR; else if (attr == &dev_attr_priv_sess_state.attr) diff --git a/include/scsi/iscsi_if.h b/include/scsi/iscsi_if.h index fe7f06c..9d28ded 100644 --- a/include/scsi/iscsi_if.h +++ b/include/scsi/iscsi_if.h @@ -489,6 +489,11 @@ enum iscsi_param { ISCSI_PARAM_CHAP_IN_IDX, ISCSI_PARAM_CHAP_OUT_IDX, + + ISCSI_PARAM_BOOT_ROOT, + ISCSI_PARAM_BOOT_NIC, + ISCSI_PARAM_BOOT_TARGET, + /* must always be last */ ISCSI_PARAM_MAX, }; diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h index 0
Re: [PATCH v2] LIBISCSI: Added new boot entries in the session sysfs
On 06/20/2013 12:21 PM, Eddie Wai wrote: > v2: Moved the new BOOT params to the end of the iscsi_param enum > > This is the kernel part of the modification to extract the net params > from the ibft sysfs to the iface struct used for the connection > request upon sync_session in the open-iscsi util. > > Three new session sysfs params are defined: > boot_root - holds the name of the /sys/firmware/ibft or iscsi_rootN > boot_nic - holds the ethernetN name > boot_target - holds the targetN name > > Signed-off-by: Eddie Wai Ok. Reviewed-by: Mike Christie -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RFC: Allow block drivers to poll for I/O instead of sleeping
A paper at FAST2012 (http://static.usenix.org/events/fast12/tech/full_papers/Yang.pdf) pointed out the performance overhead of taking interrupts for low-latency block I/Os. The solution the author investigated was to spin waiting for each I/O to complete. This is inefficient as Linux submits many I/Os which are not latency-sensitive, and even when we do submit latency-sensitive I/Os (eg swap-in), we frequently submit several I/Os before waiting. This RFC takes a different approach, only spinning when we would otherwise sleep. To implement this, I add an 'io_poll' function pointer to backing_dev_info. I include a sample implementation for the NVMe driver. Next, I add an io_wait() function which will call io_poll() if it is set. It falls back to calling io_schedule() if anything goes wrong with io_poll() or the task exceeds its timeslice. Finally, all that is left is to judiciously replace calls to io_schedule() with calls to io_wait(). I think I've covered the main contenders with sleep_on_page(), sleep_on_buffer() and the DIO path. I've measured the performance benefits of this with a Chatham NVMe prototype device and a simple # dd if=/dev/nvme0n1 of=/dev/null iflag=direct bs=512 count=100 The latency of each I/O reduces by about 2.5us (from around 8.0us to around 5.5us). This matches up quite well with the performance numbers shown in the FAST2012 paper (which used a similar device). Is backing_dev_info the right place for this function pointer? Have I made any bad assumptions about the correct way to retrieve the backing_dev_info from (eg) a struct page, buffer_head or kiocb? Should I pass a 'state' into io_wait() instead of retrieving it from 'current'? Is kernel/sched/core.c the right place for io_wait()? Should it be bdi_wait() instead? Should it be marked with __sched? Where should I add documentation for the io_poll() function pointer? NB: The NVMe driver piece of this is for illustrative purposes only and should not be applied. I've rearranged the diff so that the interesting pieces appear first. diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h index c388155..97f8fde 100644 --- a/include/linux/backing-dev.h +++ b/include/linux/backing-dev.h @@ -68,6 +68,7 @@ struct backing_dev_info { unsigned int capabilities; /* Device capabilities */ congested_fn *congested_fn; /* Function pointer if device is md/dm */ void *congested_data; /* Pointer to aux data for congested func */ + int (*io_poll)(struct backing_dev_info *); char *name; @@ -126,6 +127,8 @@ int bdi_has_dirty_io(struct backing_dev_info *bdi); void bdi_wakeup_thread_delayed(struct backing_dev_info *bdi); void bdi_lock_two(struct bdi_writeback *wb1, struct bdi_writeback *wb2); +void io_wait(struct backing_dev_info *bdi); + extern spinlock_t bdi_lock; extern struct list_head bdi_list; diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 58453b8..4840065 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4527,6 +4527,36 @@ long __sched io_schedule_timeout(long timeout) return ret; } +/* + * Wait for an I/O to complete against this backing_dev_info. If the + * task exhausts its timeslice polling for completions, call io_schedule() + * anyway. If a signal comes pending, return so the task can handle it. + * If the io_poll returns an error, give up and call io_schedule(), but + * swallow the error. We may miss an I/O completion (eg if the interrupt + * handler gets to it first). Guard against this possibility by returning + * if we've been set back to TASK_RUNNING. + */ +void io_wait(struct backing_dev_info *bdi) +{ + if (bdi && bdi->io_poll) { + long state = current->state; + while (!need_resched()) { + int ret = bdi->io_poll(bdi); + if ((ret > 0) || signal_pending_state(state, current)) { + set_current_state(TASK_RUNNING); + return; + } + if (current->state == TASK_RUNNING) + return; + if (ret < 0) + break; + cpu_relax(); + } + } + + io_schedule(); +} + /** * sys_sched_get_priority_max - return maximum RT priority. * @policy: scheduling class. diff --git a/fs/aio.c b/fs/aio.c index 2bbcacf..7b20397 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -453,11 +453,15 @@ static void kill_ioctx(struct kioctx *ctx) */ ssize_t wait_on_sync_kiocb(struct kiocb *iocb) { + struct backing_dev_info *bdi = NULL; + + if (iocb->ki_filp) + bdi = iocb->ki_filp->f_mapping->backing_dev_info; while (atomic_read(&iocb->ki_users)) { set_current_state(TASK_UNINTERRUPTIBLE); if (!atomic_read(&iocb->ki_users)) break; - io_schedule(); +